[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2016-11-21 Thread Raymond Hettinger
Changes by Raymond Hettinger: -- assignee: rhettinger -> (unassigned)

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2016-11-21 Thread Mark Lawrence
Changes by Mark Lawrence: -- nosy: -BreamoreBoy

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2016-11-20 Thread Xiang Zhang
Changes by Xiang Zhang: -- versions: +Python 3.7 -Python 3.5

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2014-06-22 Thread Mark Lawrence
Mark Lawrence added the comment: The code given in msg183579 works perfectly in 3.4.1 and 3.5.0. Is there anything to fix here whether code or docs? -- nosy: +BreamoreBoy ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15851

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2014-06-22 Thread karl
karl added the comment: Mark, the code uses urllib to demonstrate the issue with Wikipedia and other sites, which block python-urllib user agents because that user agent is used by many spam harvesters. The proposal is about giving a possibility in robotparser lib to add a feature for

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2014-06-22 Thread karl
karl added the comment: Note that one of the proposals is to just document in https://docs.python.org/3/library/urllib.robotparser.html the workaround proposed in msg169722 (available in 3.4+): robotparser.URLopener.version = 'MyVersion'

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2014-06-22 Thread Mark Lawrence
Mark Lawrence added the comment:
c:\cpython\PCbuild>python_d.exe -V
Python 3.5.0a0
c:\cpython\PCbuild>type C:\Users\Mark\MyPython\mytest.py
#!/usr/bin/env python3
# -*- coding: latin-1 -*-
import urllib.request
opener = urllib.request.build_opener()
opener.addheaders = [('User-agent',

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2014-06-22 Thread Raymond Hettinger
Changes by Raymond Hettinger raymond.hettin...@gmail.com: -- assignee: -> rhettinger; nosy: +rhettinger; versions: +Python 3.5 -Python 3.4

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2014-06-22 Thread karl
karl added the comment:
→ python
Python 2.7.5 (default, Mar 9 2014, 22:15:05)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import robotparser
>>> rp =

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2014-06-22 Thread Raymond Hettinger
Changes by Raymond Hettinger raymond.hettin...@gmail.com: -- versions: +Python 3.5 -Python 3.4

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2013-03-11 Thread Tshepang Lekhonkhobe
Changes by Tshepang Lekhonkhobe tshep...@gmail.com: -- nosy: +tshepang

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2013-03-05 Thread karl
karl added the comment: Setting a user agent string should be possible. My guess is that the default library has been used by an abusive client (by mistake or by intent) and the Wikimedia project has decided to blacklist the client based on user-agent string sniffing. The match is on anything

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2012-09-11 Thread Senthil Kumaran
Senthil Kumaran added the comment: Hi Eduardo, I tested further and do observe some very strange oddities.
On Mon, Sep 10, 2012 at 10:45 PM, Eduardo A. Bustamante López rep...@bugs.python.org wrote:
> Also, I'm aware that you shouldn't normally worry about setting a specific user-agent to

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2012-09-10 Thread Senthil Kumaran
Senthil Kumaran added the comment: Hello Eduardo, I fail to see the bug here. The robotparser module is for reading and parsing the robots.txt file; the module responsible for fetching it could be urllib. robots.txt is always available from the web server and you can download robots.txt by any means,
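The "download it by any means, then parse it" approach described above can be sketched as follows. This is a minimal illustration rather than code from the thread: the helper names `fetch_robots` and `parser_from_text`, the `MyCrawler/1.0` user agent, and the sample robots.txt content are all assumptions made for the example. It uses Python 3's `urllib.robotparser` and relies on `RobotFileParser.parse()`, which accepts an iterable of lines.

```python
import urllib.request
import urllib.robotparser


def fetch_robots(url, user_agent):
    """Download robots.txt with an explicit User-Agent header (hypothetical helper)."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def parser_from_text(text):
    """Build a parser from robots.txt text that was obtained by any means."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# Offline demonstration with made-up robots.txt content; a real caller would
# pass the result of fetch_robots(...) instead.
sample = "User-agent: *\nDisallow: /private/\n"
rp = parser_from_text(sample)
allowed_public = rp.can_fetch("MyCrawler/1.0", "http://example.com/index.html")
allowed_private = rp.can_fetch("MyCrawler/1.0", "http://example.com/private/page")
```

Because the download is separated from the parsing, the caller controls the request headers without touching any global urllib state.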

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2012-09-10 Thread Eduardo A . Bustamante López
Eduardo A. Bustamante López added the comment: Hi Senthil,
> I fail to see the bug here. The robotparser module is for reading and parsing the robots.txt file; the module responsible for fetching it could be urllib.
You're right, but robotparser's read() does a call to urllib.request.urlopen to

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2012-09-09 Thread Eduardo A . Bustamante López
Eduardo A. Bustamante López added the comment: I'm not sure what's the best approach here. 1. Avoid changes in the Lib, and document a work-around, which involves installing an opener with the specific User-agent. The draw-back is that it modifies the behaviour of urlopen() globally, so
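The opener-based workaround mentioned in option 1 might look roughly like the sketch below. It assumes Python 3's `urllib.request` and `urllib.robotparser`; the function name `install_user_agent` and the `MyCrawler/1.0` string are hypothetical. The network call is left commented out so that the snippet runs offline.

```python
import urllib.request
import urllib.robotparser

USER_AGENT = "MyCrawler/1.0"  # hypothetical value, chosen for illustration


def install_user_agent(user_agent):
    """Install a global opener that sends the given User-agent header."""
    opener = urllib.request.build_opener()
    opener.addheaders = [("User-agent", user_agent)]
    # install_opener() changes what urllib.request.urlopen() uses for the
    # whole process, which is the drawback noted in the comment above.
    urllib.request.install_opener(opener)
    return opener


opener = install_user_agent(USER_AGENT)

rp = urllib.robotparser.RobotFileParser()
rp.set_url("http://en.wikipedia.org/robots.txt")
# rp.read()  # would now fetch robots.txt through the installed opener
```

Every later `urlopen()` call in the same process picks up the same header, so this suits small scripts better than libraries that share a process with other code.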

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2012-09-09 Thread Eduardo A . Bustamante López
Eduardo A. Bustamante López added the comment: I forgot to mention that I ran a nc process in parallel, to see what data is being sent: ``nc -l -p ``.

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2012-09-08 Thread Ezio Melotti
Changes by Ezio Melotti ezio.melo...@gmail.com: -- nosy: +ezio.melotti

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2012-09-07 Thread Terry J. Reedy
Terry J. Reedy added the comment: Enhancements can only be targeted at 3.4, where robotparser is now urllib.robotparser. I wonder if documenting the simple solution would be sufficient. -- nosy: +orsenthil, terry.reedy versions: +Python 3.4 -Python 2.7

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2012-09-07 Thread Terry J. Reedy
Terry J. Reedy added the comment: In any case, a doc change *could* go in 2.7 and 3.3/2.

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2012-09-02 Thread Eduardo A . Bustamante López
New submission from Eduardo A. Bustamante López: I found that http://en.wikipedia.org/robots.txt returns 403 if the provided user agent is in a specific blacklist. And since robotparser doesn't provide a mechanism to change the default user agent used by the opener, it becomes unusable for
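One way to picture the missing mechanism is a subclass that carries its own user agent. The sketch below is hypothetical: `UserAgentRobotFileParser`, its `user_agent` argument, and `build_request` are not part of the standard library and are not taken from the attached myrobotparser.py. The `read()` override is intended to roughly follow how Python 3's `RobotFileParser.read()` treats HTTP errors (401/403 disallow everything, other 4xx allow everything), which is an assumption about the stdlib's behaviour rather than something the thread states.

```python
import urllib.error
import urllib.request
import urllib.robotparser


class UserAgentRobotFileParser(urllib.robotparser.RobotFileParser):
    """Hypothetical subclass; not part of the standard library."""

    def __init__(self, url="", user_agent="MyCrawler/1.0"):
        super().__init__(url)
        self.user_agent = user_agent

    def build_request(self):
        """Create the request for robots.txt with the custom User-Agent header."""
        return urllib.request.Request(
            self.url, headers={"User-Agent": self.user_agent}
        )

    def read(self):
        """Fetch and parse robots.txt using the custom header."""
        try:
            response = urllib.request.urlopen(self.build_request())
        except urllib.error.HTTPError as err:
            if err.code in (401, 403):
                self.disallow_all = True
            elif 400 <= err.code < 500:
                self.allow_all = True
        else:
            with response:
                self.parse(response.read().decode("utf-8").splitlines())


rp = UserAgentRobotFileParser(
    "http://en.wikipedia.org/robots.txt", user_agent="MyCrawler/1.0"
)
request = rp.build_request()  # no network access happens until rp.read()
```

Unlike installing a global opener, this keeps the header local to one parser instance.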

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2012-09-02 Thread Eduardo A . Bustamante López
Changes by Eduardo A. Bustamante López dual...@gmail.com: Added file: http://bugs.python.org/file27101/myrobotparser.py

[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2012-09-02 Thread Eduardo A . Bustamante López
Eduardo A. Bustamante López added the comment: I guess a workaround is to do: robotparser.URLopener.version = 'MyVersion'