[issue24954] No way to generate or parse timezone as produced by datetime.isoformat()
John Nagle added the comment:

As the original author of the predecessor bug report (issue 15873) in 2012, I would suggest that there's too much bikeshedding here. I filed this bug because there was no usable ISO 8601 date parser available: PyPI contained four slightly different buggy ones, and three more versions were found later. I suggested following RFC 3339, "Date and Time on the Internet: Timestamps", section 5.6, which specifies a clear subset of ISO 8601. Five years later, I suggest just going with that. Fancier variations belong in non-standard libraries. Date parsing should not be platform-dependent: using an available C library was convenient, but not portable. Let's get this done.

--
___ Python tracker <http://bugs.python.org/issue24954> ___
___ Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue28756] robotfileparser always uses default Python user-agent
John Nagle added the comment:

(That's from a subclass I wrote. As a change to RobotFileParser, __init__ should start like this.)

    def __init__(self, url='', user_agent=None):
        self.user_agent = user_agent    # save user agent
        ...

--
___ Python tracker <http://bugs.python.org/issue28756> ___
[issue28756] robotfileparser always uses default Python user-agent
John Nagle added the comment:

Suggest adding a user_agent optional parameter, as shown here:

    def __init__(self, url='', user_agent=None):
        urllib.robotparser.RobotFileParser.__init__(self, url)  # init parent
        self.user_agent = user_agent                            # save user agent

    def read(self):
        """Reads the robots.txt URL and feeds it to the parser.
        Overrides parent read function.
        """
        try:
            req = urllib.request.Request(self.url, data=None)   # request with user agent specified
            if self.user_agent is not None:                     # if overriding user agent
                req.add_header("User-Agent", self.user_agent)
            f = urllib.request.urlopen(req)                     # open connection
        except urllib.error.HTTPError as err:
            if err.code in (401, 403):
                self.disallow_all = True
            elif err.code >= 400 and err.code < 500:
                self.allow_all = True
        else:
            raw = f.read()
            self.parse(raw.decode("utf-8").splitlines())

--
___ Python tracker <http://bugs.python.org/issue28756> ___
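Pending such a parameter, one workaround is to fetch robots.txt yourself with any user agent you like and hand the text to parse() instead of calling read(). A minimal offline sketch; the robots.txt content, crawler name, and URL here are invented for illustration:

```python
import urllib.robotparser

# Workaround sketch: fetch robots.txt yourself (with any User-Agent),
# then feed the text to RobotFileParser.parse() instead of read().
# The fetched text is faked here to keep the example offline.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

rp = urllib.robotparser.RobotFileParser()
rp.set_url('http://example.com/robots.txt')
rp.parse(robots_txt.splitlines())
rp.modified()   # record a fetch time; newer can_fetch() otherwise
                # treats an unread robots.txt as "disallow all"

print(rp.can_fetch('MyCrawler/1.0', 'http://example.com/private/x'))  # False
print(rp.can_fetch('MyCrawler/1.0', 'http://example.com/public/x'))   # True
```

This sidesteps the user-agent problem entirely, since the fetch is under the caller's control.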
[issue28756] robotfileparser always uses default Python user-agent
New submission from John Nagle:

urllib.robotparser.RobotFileParser always uses the default Python user agent. This agent is now blacklisted by many sites, and it's not possible to read the robots.txt file at all.

--
components: Library (Lib)
messages: 281314
nosy: nagle
priority: normal
severity: normal
status: open
title: robotfileparser always uses default Python user-agent
type: enhancement
versions: Python 2.7, Python 3.3, Python 3.4, Python 3.5, Python 3.6, Python 3.7
___ Python tracker <http://bugs.python.org/issue28756> ___
[issue27065] robotparser user agent considered hostile by mod_security rules.
New submission from John Nagle:

"robotparser" uses the default Python user agent when reading the "robots.txt" file, and there's no parameter for changing that. Unfortunately, the "mod_security" add-on for the Apache web server, when used with the standard OWASP rule set, blacklists the default Python USER-AGENT in Rule 990002, User Agent Identification. It doesn't like certain HTTP USER-AGENT values; one of them is "python-httplib2". So any Python program which accesses the web site will trigger this rule and be blocked from access.

For regular HTTP accesses, it's possible to put a user agent string in the Request object and work around this. But "robotparser" has no such option. Worse, if "robotparser" has its read of "robots.txt" rejected, it interprets that as a "deny all" robots.txt file, and returns False for all "can_fetch()" requests.

--
components: Library (Lib)
messages: 265900
nosy: nagle
priority: normal
severity: normal
status: open
title: robotparser user agent considered hostile by mod_security rules.
type: behavior
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6
___ Python tracker <http://bugs.python.org/issue27065> ___
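The Request-object workaround mentioned above looks like this; the crawler name and URL are placeholders, not from the report:

```python
import urllib.request

# Workaround for ordinary fetches: put the user agent string in the
# Request object. robotparser offers no equivalent hook, which is the
# point of this report. The agent string and URL are placeholders.
req = urllib.request.Request(
    'http://example.com/robots.txt',
    headers={'User-Agent': 'MyCrawler/1.0'})
print(req.get_header('User-agent'))  # 'MyCrawler/1.0'
```

No connection is made here; the header is simply recorded on the Request and sent when urlopen() is eventually called with it.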
[issue24985] Python install test fails - OpenSSL - "dh key too small"
New submission from John Nagle:

Installing Python 3.4.3 on a new CentOS Linux release 7.1.1503 server. Started with the source tarball, did the usual ./configure; make; make test. The SSL test fails with "dh key too small". See below.

OpenSSL has recently been modified to reject short keys, due to a security vulnerability. See http://www.ubuntu.com/usn/usn-2639-1/ and, for an analysis of the issue on a Python install, see http://www.alexrhino.net/jekyll/update/2015/07/14/dh-params-test-fail.html

Apparently the "dh512.pem" file in the test suite is now obsolete, because the minimum dh key length is now 768. The question is, does this break anything else? Google for "dh key too small" and various other projects report problems.

==
ERROR: test_dh_params (test.test_ssl.ThreadedTests)
--
Traceback (most recent call last):
  File "/home/sitetruth/private/downloads/python/Python-3.4.3/Lib/test/test_ssl.py", line 2728, in test_dh_params
    chatty=True, connectionchatty=True)
  File "/home/sitetruth/private/downloads/python/Python-3.4.3/Lib/test/test_ssl.py", line 1866, in server_params_test
    s.connect((HOST, server.port))
  File "/home/sitetruth/private/downloads/python/Python-3.4.3/Lib/ssl.py", line 846, in connect
    self._real_connect(addr, False)
  File "/home/sitetruth/private/downloads/python/Python-3.4.3/Lib/ssl.py", line 837, in _real_connect
    self.do_handshake()
  File "/home/sitetruth/private/downloads/python/Python-3.4.3/Lib/ssl.py", line 810, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: SSL_NEGATIVE_LENGTH] dh key too small (_ssl.c:600)
--
Ran 99 tests in 12.012s
FAILED (errors=1, skipped=4)
test test_ssl failed
make: *** [test] Error 1
==

--
components: Installation
messages: 249566
nosy: nagle
priority: normal
severity: normal
status: open
title: Python install test fails - OpenSSL - "dh key too small"
versions: Python 3.4
___ Python tracker <http://bugs.python.org/issue24985> ___
[issue23843] ssl.wrap_socket doesn't handle virtual TLS hosts
John Nagle added the comment:

I'm using wrap_socket because I want to read the details of a server's SSL certificate. The note "Starting from Python 3.2, it can be more flexible to use SSLContext.wrap_socket() instead" does not convey that ssl.wrap_socket() will fail to connect to some servers because it will silently check the wrong certificate.

--
___ Python tracker <http://bugs.python.org/issue23843> ___
[issue23588] Errno conflicts in ssl.SSLError
John Nagle added the comment:

If SSL error reporting is getting some attention, something should be done to provide better text messages for the SSL errors. All certificate verify exceptions return the string "certificate verify failed (_ssl.c:581)". The line number in _ssl.c is not particularly useful to end users. OpenSSL has more specific messages, but they're not making it to the Python level. 'python ssl "certificate verify failed"' has 17,000 hits in Google, which indicates users need more help dealing with this class of error.

--
nosy: +nagle
___ Python tracker <http://bugs.python.org/issue23588> ___
[issue23843] ssl.wrap_socket doesn't handle virtual TLS hosts
New submission from John Nagle:

ssl.wrap_socket() always uses the SSL certificate associated with the raw IP address, rather than using the server-name (SNI) feature of TLS. Even when wrap_socket is used before calling "connect(port, host)", the "host" parameter isn't used by TLS. To get proper TLS behavior (which only works in recent Python versions), it's necessary to create an SSLContext, then use

    context.wrap_socket(sock, server_hostname="example.com")

This behavior is backwards-compatible (the SSL module didn't talk TLS until very recently) but confusing, and the documentation does not reflect the difference. There's a lot of old code and online advice which suggests using ssl.wrap_socket(). It works until you hit a virtual host with TLS support; then you get the wrong server cert and an unexpected "wrong host" SSL error.

Possible fixes:

1. Deprecate ssl.wrap_socket(), and modify the documentation to tell users to always use context.wrap_socket().
2. Add a "server_hostname" parameter to ssl.wrap_socket(); currently only context.wrap_socket() accepts that parameter. Modify the documentation accordingly.

--
assignee: docs@python
components: Documentation, Library (Lib)
messages: 239834
nosy: docs@python, nagle
priority: normal
severity: normal
status: open
title: ssl.wrap_socket doesn't handle virtual TLS hosts
versions: Python 3.4
___ Python tracker <http://bugs.python.org/issue23843> ___
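The context-based pattern recommended above can be sketched without making any connection, since wrapping alone records where server_hostname will drive SNI and certificate matching (the hostname is a placeholder):

```python
import socket
import ssl

# Correct TLS pattern per the report: wrap through an SSLContext so
# server_hostname drives SNI and certificate hostname checking.
# The handshake would happen on connect(); none is attempted here.
context = ssl.create_default_context()
sock = context.wrap_socket(socket.socket(socket.AF_INET),
                           server_hostname='example.com')
name = sock.server_hostname
print(name)  # 'example.com'
sock.close()
```

With plain ssl.wrap_socket() there is nowhere to supply this hostname, which is exactly the problem being reported.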
[issue23736] "make test" on clean py3 install on CentOS 6.2 - 2 tests fail
New submission from John Nagle:

Installing Python 3.4.2 on CentOS 6. Clean install. Using the procedure in the README file:

    ./configure
    make
    make test

2 tests fail in "make test".

The first one is because the FTP client test is trying to test against a site that is long gone, the Digital Equipment Corporation Systems Research Center in Palo Alto:

ERROR: test_ftp (test.test_urllib2net.OtherNetworkTests) (url='ftp://gatekeeper.research.compaq.com/pub/DEC/SRC/research-reports/00README-Legal-Rules-Regs')
--
Traceback (most recent call last):
  File "/home/staging/local/python/Python-3.4.3/Lib/urllib/request.py", line 1399, in ftp_open
    fw = self.connect_ftp(user, passwd, host, port, dirs, req.timeout)
  File "/home/staging/local/python/Python-3.4.3/Lib/urllib/request.py", line 1445, in connect_ftp
    dirs, timeout)
  File "/home/staging/local/python/Python-3.4.3/Lib/urllib/request.py", line 2243, in __init__
    self.init()
  File "/home/staging/local/python/Python-3.4.3/Lib/urllib/request.py", line 2249, in init
    self.ftp.connect(self.host, self.port, self.timeout)
  File "/home/staging/local/python/Python-3.4.3/Lib/ftplib.py", line 153, in connect
    source_address=self.source_address)
  File "/home/staging/local/python/Python-3.4.3/Lib/socket.py", line 512, in create_connection
    raise err
  File "/home/staging/local/python/Python-3.4.3/Lib/socket.py", line 503, in create_connection
    sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out

The second one is failing because "readline" (probably GNU readline) didn't behave as expected. The installed GCC is "gcc (GCC) 4.4.6 20110731 (Red Hat 4.4.6-3)", which came with "CentOS release 6.2 (Final)". This is a long-running production server. Is that too old?

FAIL: test_init (test.test_readline.TestReadline)
--
Traceback (most recent call last):
  File "/home/staging/local/python/Python-3.4.3/Lib/test/test_readline.py", line 57, in test_init
    self.assertEqual(stdout, b'')
AssertionError: b'\x1b[?1034h' != b''

--
components: Installation
messages: 238869
nosy: nagle
priority: normal
severity: normal
status: open
title: "make test" on clean py3 install on CentOS 6.2 - 2 tests fail
versions: Python 3.4
___ Python tracker <http://bugs.python.org/issue23736> ___
[issue23655] Memory corruption using pickle over pipe to subprocess
John Nagle added the comment:

More info: the problem is on the "unpickle" side. If I use _Unpickler and Pickler, so the unpickle side is in Python but the pickle side is in C, there is no problem. If I use Unpickler and _Pickler, so the unpickle side is C, it crashes.

--
___ Python tracker <http://bugs.python.org/issue23655> ___
[issue23655] Memory corruption using pickle over pipe to subprocess
John Nagle added the comment:

"minimize your data" - that's a big job here. Where are the tests for "pickle"? Is there one that talks to a subprocess over a pipe? Maybe I can adapt that.

--
___ Python tracker <http://bugs.python.org/issue23655> ___
[issue23655] Memory corruption using pickle over pipe to subprocess
John Nagle added the comment:

> Or just use pickle._Pickler instead of pickle.Pickler and like
> (implementation detail!).

Tried that. Changed my own code as follows:

    25a26
    >
    71,72c72,73
    < self.reader = pickle.Unpickler(self.proc.stdout)   # set up reader
    < self.writer = pickle.Pickler(self.proc.stdin,kpickleprotocolversion)
    ---
    > self.reader = pickle._Unpickler(self.proc.stdout)  # set up reader
    > self.writer = pickle._Pickler(self.proc.stdin,kpickleprotocolversion)
    125,126c126,127
    < self.reader = pickle.Unpickler(self.datain)        # set up reader
    < self.writer = pickle.Pickler(self.dataout,kpickleprotocolversion)
    ---
    > self.reader = pickle._Unpickler(self.datain)       # set up reader
    > self.writer = pickle._Pickler(self.dataout,kpickleprotocolversion)

The program runs after those changes. So it looks like cPickle has a serious memory corruption problem.

--
___ Python tracker <http://bugs.python.org/issue23655> ___
[issue23655] Memory corruption using pickle over pipe to subprocess
New submission from John Nagle:

I'm porting a large, working system from Python 2 to Python 3, using "six", so the same code works with both. One part of the system works a lot like the multiprocessing module, but predates it. It launches child processes with "Popen" and talks to them using "pickle" over stdin/stdout as pipes. It works fine under Python 2, and has been working in production for years. Under Python 3, I'm getting errors that indicate memory corruption:

Fatal Python error: GC object already tracked

Current thread 0x1a14 (most recent call first):
  File "C:\python34\lib\site-packages\pymysql\connections.py", line 411 in description
  File "C:\python34\lib\site-packages\pymysql\connections.py", line 1248 in _get_descriptions
  File "C:\python34\lib\site-packages\pymysql\connections.py", line 1182 in _read_result_packet
  File "C:\python34\lib\site-packages\pymysql\connections.py", line 1132 in read
  File "C:\python34\lib\site-packages\pymysql\connections.py", line 929 in _read_query_result
  File "C:\python34\lib\site-packages\pymysql\connections.py", line 768 in query
  File "C:\python34\lib\site-packages\pymysql\cursors.py", line 282 in _query
  File "C:\python34\lib\site-packages\pymysql\cursors.py", line 134 in execute
  File "C:\projects\sitetruth\domaincacheitem.py", line 128 in select
  File "C:\projects\sitetruth\domaincache.py", line 30 in search
  File "C:\projects\sitetruth\ratesite.py", line 31 in ratedomain
  File "C:\projects\sitetruth\RatingProcess.py", line 68 in call
  File "C:\projects\sitetruth\subprocesscall.py", line 140 in docall
  File "C:\projects\sitetruth\subprocesscall.py", line 158 in run
  File "C:\projects\sitetruth\RatingProcess.py", line 89 in main
  File "C:\projects\sitetruth\RatingProcess.py", line 95 in

That's clear memory corruption. Also:

  File "C:\projects\sitetruth\InfoSiteRating.py", line 200, in scansite
    if len(self.badbusinessinfo) > 0 :   # if bad stuff
NameError: name 'len' is not defined

There are others, but those two should be impossible to cause from Python source.

I've done the obvious stuff: deleted all .pyc files and Python cache directories. All my code is in Python. Every library module came in via "pip", into a clean Python 3.4.3 (32 bit) installation on Win7/x86-64. Currently installed packages (via "pip list"):

    beautifulsoup4 (4.3.2)
    dnspython3 (1.12.0)
    html5lib (0.999)
    pip (6.0.8)
    PyMySQL (0.6.6)
    pyparsing (2.0.3)
    setuptools (12.0.5)
    six (1.9.0)

Nothing exotic there. The project has zero local C code; any C code came from the Python installation or the above packages, most of which are pure Python. It all works fine with Python 2.7.9. Everything else in the program seems to be working fine under both 2.7.9 and 3.4.3, until subprocesses are involved.

What's being pickled is very simple; no custom objects, although Exception types are sometimes pickled if the subprocess raises an exception. Pickler and Unpickler instances are being reused here. A message is pickled, piped to the subprocess, unpickled, work is done, and a response comes back later via the return pipe. A send looks like:

    self.writer.dump(args)      # send data
    self.dataout.flush()        # finish output
    self.writer.clear_memo()    # no memory from cycle to cycle

and a receive looks like:

    result = self.reader.load() # read and return from child
    self.reader.memo = {}       # no memory from cycle to cycle

Those were the recommended way to reset "pickle" for new traffic years ago. (You have to clear the receive side as well as the send side, or the dictionary of saved objects grows forever.) My guess is that there's something about reusing "pickle" instances that botches memory use in CPython 3's C code for "cpickle". That should work, though; the "multiprocessing" module works by sending pickled data over pipes.

The only code difference between Python 2 and 3 is that under Python 3 I have to use "sys.stdin.buffer" and "sys.stdout.buffer" as arguments to Pickler and Unpickler; otherwise they complain that they're getting type "str". Unfortunately, I don't have an easy way to reproduce this bug yet. Is there some way to force the use of the pure Python pickle module under Python 3? That would help isolate the problem.

John Nagle

--
components: Library (Lib)
messages: 238009
nosy: nagle
priority: normal
severity: normal
status: open
title: Memory corruption using pickle over pipe to subprocess
versions: Python 3.4
___ Python tracker <http://bugs.python.org/issue23655> ___
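The reuse pattern described above can be reproduced offline, with io.BytesIO standing in for the subprocess pipes. This is a sketch of the pattern, not the reporter's code; the messages are invented:

```python
import io
import pickle

# Offline sketch of the reuse pattern in this report: one Pickler per
# stream, memo cleared between messages so no state carries from cycle
# to cycle. io.BytesIO stands in for the stdin/stdout pipes.
pipe = io.BytesIO()
writer = pickle.Pickler(pipe, 2)      # protocol 2, usable by both 2.x and 3.x
for msg in (['a', 1], {'b': 2}):
    writer.dump(msg)                  # send data
    writer.clear_memo()               # no memory from cycle to cycle

pipe.seek(0)
reader = pickle.Unpickler(pipe)
first = reader.load()                 # read back in order
second = reader.load()
print(first, second)
```

As to forcing the pure Python implementation: the underscore-prefixed classes pickle._Pickler and pickle._Unpickler are the Python implementations, which is the substitution tried later in this thread.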
[issue9679] unicode DNS names in urllib, urlopen
John Nagle added the comment:

Three years later, I'm converting to Python 3. Did this get fixed in Python 3?

--
___ Python tracker <http://bugs.python.org/issue9679> ___
[issue23476] SSL cert verify fail for "www.verisign.com"
John Nagle added the comment:

Will this be applied to the Python 2.7.9 library as well?

--
___ Python tracker <http://bugs.python.org/issue23476> ___
[issue23476] SSL cert verify fail for "www.verisign.com"
John Nagle added the comment:

The "fix" in Ubuntu was to the Ubuntu certificate store, which is a directory tree with one cert per file, with lots of symbolic links with names based on hashes to express dependencies. Python's SSL isn't using that. Python is taking in one big text file of SSL certs, with no link structure, and feeding it to OpenSSL. This is an option at

    SSLContext.load_verify_locations(cafile=None, capath=None, cadata=None)

I've been testing with "cafile". "capath" is a path to a set of preprocessed certs laid out like the Ubuntu certificate store. It may be that the directory parameter works but the single-file parameter does not. It's possible to create such a directory from a single .pem file by splitting the big file into smaller files (the suggested tool is an "awk" script) and then running "c_rehash", which comes with OpenSSL. See https://www.openssl.org/docs/apps/c_rehash.html

So I tried a workaround, using Python 3.4.0 and Ubuntu 14.04 LTS. I broke up "cacert.pem" into one file per cert with the suggested "awk" script, and used "c_rehash" to build all the links, creating a directory suitable for "capath". It didn't help. It fails for "verisign.com" and works for "python.org" and "google.com", just like the original single-file test. The "capath" version did exactly the same thing as the "cafile" version.

Python is definitely reading the cert file or directories; if I try an empty cert file or dir, everything fails, like it should. Tried the same thing on Win7 x64. Same result. Tried the command line openssl tool using the cert directory. Same results as with the single file on both platforms. So that's not it.

A fix to OpenSSL was proposed in 2012, but no action was taken: http://rt.openssl.org/Ticket/Display.html?id=2732 at "Wed Jun 13 17:15:04 2012 Arne Becker - Correspondence added".

Any ideas?

--
___ Python tracker <http://bugs.python.org/issue23476> ___
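The awk split step can just as well be done in Python. A sketch; "c_rehash" must still be run afterwards to create the hash-named links, and the two-certificate bundle below is a stand-in for cacert.pem:

```python
import re

# Split a CA bundle into individual certificates -- the first step of
# building a "capath" directory. A tiny fake bundle stands in for
# cacert.pem; real certs carry base64 bodies between the markers.
bundle = """\
-----BEGIN CERTIFICATE-----
AAAA
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
BBBB
-----END CERTIFICATE-----
"""

certs = re.findall(
    r'-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----',
    bundle, re.DOTALL)
for i, cert in enumerate(certs):
    # in real use, write each to its own file for c_rehash:
    # open('capath/cert%d.pem' % i, 'w').write(cert + '\n')
    print('cert %d: %d characters' % (i, len(cert)))
```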
[issue23476] SSL cert verify fail for "www.verisign.com"
John Nagle added the comment:

To try this with the OpenSSL command line client, use this shell command:

    openssl s_client -connect www.verisign.com:443 -CAfile cacert.pem

This provides more detailed error messages than Python provides. "verify error:num=20:unable to get local issuer certificate" is the OpenSSL error for "www.verisign.com". The corresponding Python error is "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)".

--
___ Python tracker <http://bugs.python.org/issue23476> ___
[issue23476] SSL cert verify fail for "www.verisign.com"
John Nagle added the comment:

Added cert file for testing. The source of this file is http://curl.haxx.se/ca/cacert.pem

--
Added file: http://bugs.python.org/file38166/cacert.pem
___ Python tracker <http://bugs.python.org/issue23476> ___
[issue23476] SSL cert verify fail for "www.verisign.com"
New submission from John Nagle:

SSL certificate verification fails for "www.verisign.com" when using the cert list from Firefox. Other sites ("google.com", "python.org") verify fine. This may be related to a known, and fixed, OpenSSL bug. See:

    http://rt.openssl.org/Ticket/Display.html?id=2732&user=guest&pass=guest
    https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1014640

Some versions of OpenSSL are known to be broken for cases where there are multiple valid certificate trees. This happens when one root cert is being phased out in favor of another, and cross-signing is involved. Python ships with its own copy of OpenSSL on Windows.

Tests for "www.verisign.com":

Win7, x64:
    Python 2.7.9 with OpenSSL 1.0.1j 15 Oct 2014    FAIL
    Python 3.4.2 with OpenSSL 1.0.1i 6 Aug 2014     FAIL
    openssl s_client - OpenSSL 1.0.1h 5 Jun 2014    FAIL

Ubuntu 14.04 LTS, x64, using distro's versions of Python:
    Python 2.7.6 - test won't run, needs create_default_context
    Python 3.4.0 with OpenSSL 1.0.1f 6 Jan 2014     FAIL
    openssl s_client OpenSSL 1.0.1f 6 Jan 2014      PASS

That's with the same cert file in all cases. The OpenSSL version for Python programs comes from ssl.OPENSSL_VERSION.

The Linux situation has me puzzled. On Linux, Python is supposedly using the system version of OpenSSL, and the versions match. Why do Python and the OpenSSL command line client disagree? Different options passed to OpenSSL by Python?

A simple test program and cert file are attached. Please try this in your environment.

--
components: Library (Lib)
files: ssltest.py
messages: 236158
nosy: nagle
priority: normal
severity: normal
status: open
title: SSL cert verify fail for "www.verisign.com"
versions: Python 2.7, Python 3.4
Added file: http://bugs.python.org/file38165/ssltest.py
___ Python tracker <http://bugs.python.org/issue23476> ___
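The attached ssltest.py is not reproduced in this message. Assuming it simply attempts a verified handshake per host, a minimal equivalent might look like this; the function name is ours, and the network loop is shown in a comment only:

```python
import socket
import ssl

def check_cert(host, port=443, cafile=None):
    """Attempt a verified TLS handshake with host; return None on
    success or the SSL error text on failure. A sketch of what the
    attached ssltest.py presumably does; the name is ours."""
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=10) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return None
    except ssl.SSLError as err:
        return str(err)

# Usage (requires network):
#     for host in ('www.verisign.com', 'python.org', 'google.com'):
#         print(host, check_cert(host) or 'PASS')
```

create_default_context() enables CERT_REQUIRED and hostname checking by default, matching the verification behavior described in the report.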
[issue20916] ssl.enum_certificates() will not return all certificates trusted by Windows
John Nagle added the comment:

Amusingly, I'm getting this failure on "verisign.com" on Windows 7 with Python 2.7.9:

    HTTP error - [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)

The current Verisign root cert (Class 3 public) is, indeed, not in the Windows 7 cert store. Verisign has a newer root cert. That error message ought to be improved: tell the user which cert was rejected. "python.org", with a DigiCert certificate, works fine. I'm going to use the Mozilla certificate store explicitly.

--
nosy: +nagle
___ Python tracker <http://bugs.python.org/issue20916> ___
[issue22873] Re: SSLsocket.getpeercert - return ALL the fields of the certificate.
John Nagle added the comment:

May be a duplicate of Issue 20469: "ssl.getpeercert() should include extensions" http://bugs.python.org/issue20469

--
___ Python tracker <http://bugs.python.org/issue22873> ___
[issue22873] Re: SSLsocket.getpeercert - return ALL the fields of the certificate.
New submission from John Nagle:

In each revision of "getpeercert", a few more fields are returned. Python 3.2 added "issuer" and "notBefore". Python 3.4 added "crlDistributionPoints", "caIssuers", and OCSP URLs. But some fields still aren't returned. I happen to need CertificatePolicies, which is how you distinguish DV, OV, and EV certs.

Here's what you get now from "getpeercert()" for "bankofamerica.com":

{'OCSP': ('http://EVSecure-ocsp.verisign.com',),
 'caIssuers': ('http://EVSecure-aia.verisign.com/EVSecure2006.cer',),
 'crlDistributionPoints': ('http://EVSecure-crl.verisign.com/EVSecure2006.crl',),
 'issuer': ((('countryName', 'US'),),
            (('organizationName', 'VeriSign, Inc.'),),
            (('organizationalUnitName', 'VeriSign Trust Network'),),
            (('organizationalUnitName', 'Terms of use at https://www.verisign.com/rpa (c)06'),),
            (('commonName', 'VeriSign Class 3 Extended Validation SSL CA'),)),
 'notAfter': 'Mar 22 23:59:59 2015 GMT',
 'notBefore': 'Feb 20 00:00:00 2014 GMT',
 'serialNumber': '69A7BC85C106DDE1CF4FA47D5ED813DC',
 'subject': ((('1.3.6.1.4.1.311.60.2.1.3', 'US'),),
             (('1.3.6.1.4.1.311.60.2.1.2', 'Delaware'),),
             (('businessCategory', 'Private Organization'),),
             (('serialNumber', '2927442'),),
             (('countryName', 'US'),),
             (('postalCode', '60603'),),
             (('stateOrProvinceName', 'Illinois'),),
             (('localityName', 'Chicago'),),
             (('streetAddress', '135 S La Salle St'),),
             (('organizationName', 'Bank of America Corporation'),),
             (('organizationalUnitName', 'Network Infrastructure'),),
             (('commonName', 'www.bankofamerica.com'),)),
 'subjectAltName': (('DNS', 'mobile.bankofamerica.com'), ('DNS', 'www.bankofamerica.com')),
 'version': 3}

Missing fields (from Firefox's view of the cert) include:

Certificate Policies:
    2.16.840.1.113733.1.7.23.6: Extended Validation (EV) SSL Server Certificate
    Certification Practice Statement pointer: https://www.verisign.com/cps
    (This tells you it's a valid EV cert.)

Certificate Basic Constraints:
    Is not a Certificate Authority
    (which means they can't issue more certs below this one)

Extended Key Usage:
    TLS Web Server Authentication (1.3.6.1.5.5.7.3.1)
    TLS Web Client Authentication (1.3.6.1.5.5.7.3.2)
    (which means this cert is for web use, not email or code signing)

How about just returning ALL the remaining fields and finishing the job, so this doesn't have to be fixed again? Thanks.

--
components: Library (Lib)
messages: 231166
nosy: nagle
priority: normal
severity: normal
status: open
title: Re: SSLsocket.getpeercert - return ALL the fields of the certificate.
versions: Python 3.4
___ Python tracker <http://bugs.python.org/issue22873> ___
[issue18907] urllib2.open FTP open times out at 20 secs despite timeout parameter
John Nagle added the comment:

The server operator at the US Securities and Exchange Commission writes to me: "There was a DNS issue that affected the availability of FTP at night. We believe it is resolved. Please let us know if you encounter any further problems. Thanks, SEC Webmaster". So this may have been a DNS-related issue, perhaps a load balancer referring the connection to a dead machine. Yet, for some reason, the Windows command line FTP client can recover from this problem after 20 seconds. What are they doing right? Completely retrying the open?

--
status: pending -> open
___ Python tracker <http://bugs.python.org/issue18907> ___
[issue18907] urllib2.open FTP open times out at 20 secs despite timeout parameter
John Nagle added the comment:

Reproduced the problem in Python 3.3 (Win32). The error message there is:

    Open of ftp://ftp.sec.gov/edgar/daily-index failed after 21.08 seconds:

So this is broken in both Python 2.7 and Python 3.3.

--
versions: +Python 3.3
Added file: http://bugs.python.org/file31559/edgartimeouttest3.py
___ Python tracker <http://bugs.python.org/issue18907> ___
[issue18907] urllib2.open FTP open times out at 20 secs despite timeout parameter
New submission from John Nagle:

urllib2.open for an FTP URL does not obey the timeout parameter. The attached test program times out on FTP open after 21 seconds, even though the specified timeout is 60 seconds. Timing is consistent; times have ranged from 21.03 to 21.05 seconds.

The Python documentation (http://docs.python.org/2/library/urllib2.html) says "The optional timeout parameter specifies a timeout in seconds for blocking operations like the connection attempt (if not specified, the global default timeout setting will be used). This actually only works for HTTP, HTTPS and FTP connections." The documentation for Python 3 reads the same.

    Open of ftp://ftp.sec.gov/edgar/daily-index failed after 21.05 seconds:

This was on Windows 7, but the same result is observed on Linux.

(The FTP server at the U.S. Securities and Exchange Commission is now imposing a 20-second connection delay during busy periods. This is causing our Python code that retrieves new SEC filings to fail. It may be necessary to run the program at different times of the day to reproduce the problem.)

--
components: Library (Lib)
files: edgartimeouttest.py
messages: 196800
nosy: nagle
priority: normal
severity: normal
status: open
title: urllib2.open FTP open times out at 20 secs despite timeout parameter
versions: Python 2.7
Added file: http://bugs.python.org/file31558/edgartimeouttest.py
___ Python tracker <http://bugs.python.org/issue18907> ___
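A workaround, rather than a fix, is to bypass the urllib FTP path and use ftplib directly, where the timeout is honored per connection and the open can be retried, which is what the Windows command line client appears to do. A sketch; the retry count is illustrative and no connection is attempted here:

```python
import ftplib
import socket

def ftp_open(host, timeout=60.0, tries=3):
    """Open an FTP connection with an explicit per-attempt timeout,
    retrying on timeout. A workaround sketch, not a fix for the
    urllib2 behavior reported above."""
    for attempt in range(tries):
        try:
            ftp = ftplib.FTP(timeout=timeout)   # timeout covers connect and reads
            ftp.connect(host)
            ftp.login()                         # anonymous login
            return ftp
        except socket.timeout:
            if attempt == tries - 1:
                raise

# Usage (requires network):
#     ftp = ftp_open('ftp.sec.gov'); ftp.cwd('edgar/daily-index')
```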
[issue15873] "datetime" cannot parse ISO 8601 dates and times
John Nagle added the comment:

For what parts of ISO 8601 to accept, there's a standard: RFC 3339, "Date and Time on the Internet: Timestamps". See section 5.6:

    date-fullyear   = 4DIGIT
    date-month      = 2DIGIT   ; 01-12
    date-mday       = 2DIGIT   ; 01-28, 01-29, 01-30, 01-31 based on
                               ; month/year
    time-hour       = 2DIGIT   ; 00-23
    time-minute     = 2DIGIT   ; 00-59
    time-second     = 2DIGIT   ; 00-58, 00-59, 00-60 based on leap second
                               ; rules
    time-secfrac    = "." 1*DIGIT
    time-numoffset  = ("+" / "-") time-hour ":" time-minute
    time-offset     = "Z" / time-numoffset
    partial-time    = time-hour ":" time-minute ":" time-second [time-secfrac]
    full-date       = date-fullyear "-" date-month "-" date-mday
    full-time       = partial-time time-offset
    date-time       = full-date "T" full-time

    NOTE: Per [ABNF] and ISO8601, the "T" and "Z" characters in this
    syntax may alternatively be lower case "t" or "z" respectively.

    ISO 8601 defines date and time separated by "T". Applications using
    this syntax may choose, for the sake of readability, to specify a
    full-date and full-time separated by (say) a space character.

That's straightforward, and can be expressed as a regular expression.

--
___ Python tracker <http://bugs.python.org/issue15873> ___
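The grammar above does reduce to a regular expression. A sketch, with fractional seconds omitted for brevity, using datetime.timezone (available since Python 3.2) as the fixed-offset tzinfo:

```python
import re
from datetime import datetime, timedelta, timezone

# The RFC 3339 date-time grammar as a regular expression. Fractional
# seconds are omitted for brevity, and leap second 60 would need
# special handling, since datetime rejects it.
RFC3339 = re.compile(
    r'(\d{4})-(\d{2})-(\d{2})[Tt ]'
    r'(\d{2}):(\d{2}):(\d{2})'
    r'([Zz]|[+-]\d{2}:\d{2})$')

def parse_rfc3339(s):
    m = RFC3339.match(s)
    if m is None:
        raise ValueError('not an RFC 3339 date-time: %r' % s)
    y, mo, d, h, mi, sec, off = m.groups()
    if off in ('Z', 'z'):
        tz = timezone.utc
    else:
        sign = 1 if off[0] == '+' else -1
        tz = timezone(sign * timedelta(hours=int(off[1:3]),
                                       minutes=int(off[4:6])))
    return datetime(int(y), int(mo), int(d), int(h), int(mi), int(sec),
                    tzinfo=tz)

print(parse_rfc3339('2012-09-09T18:00:00-07:00'))
```

The result is an aware datetime, so isoformat() round-trips the original string, offset included.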
[issue15873] "datetime" cannot parse ISO 8601 dates and times
John Nagle added the comment: Re: "%z format is supported". That's platform-specific; the actual parsing is delegated to the C library. It's not in Python 2.7 / Win32:

    ValueError: 'z' is a bad directive in format '%Y-%m-%dT%H:%M:%S%z'

It really shouldn't be platform-specific; the underlying platform is irrelevant to this task. That's also a documentation error: features not common to all supported Python platforms should not be mentioned in the documentation.

Re: "I would very much like such promiscuous parser to be implemented in datetime.__new__." For string input, it's probably better to do this conversion in a specific class-level function. Full ISO 8601 dates/times generally come from computer-generated data via a file or API. If invalid text shows up, it should be detected as an error, not heuristically interpreted as a date. There are already "fromtimestamp" and "fromordinal" class methods, and "isoformat" as an instance method, so "fromisoformat" seems reasonable.

I'd also suggest providing a standard subclass of tzinfo in datetime for fixed offsets. That's needed to express the time zone information in an ISO 8601 date. The new "fromisoformat" would convert an ISO 8601 date/time string to a time-zone "aware" datetime object. If converted back to an ISO 8601 string with .isoformat(), the round trip should preserve the original data, including the time zone offset.

(Several more implementations of this conversion have turned up. In addition to the four already mentioned, there was one in xml.util, and one in feedparser. There are probably more yet to be found.)

-- ___ Python tracker <http://bugs.python.org/issue15873> ___
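For the record, later Python versions adopted both suggestions: datetime.timezone (added in 3.2) is the standard fixed-offset tzinfo subclass, and datetime.fromisoformat() (added in 3.7) is the class-level constructor. The round trip described above holds:

```python
from datetime import datetime

# Parse an ISO 8601 string to an aware datetime (Python 3.7+), then
# format it back; the fixed -07:00 offset survives the round trip.
s = '2012-09-09T18:00:00-07:00'
dt = datetime.fromisoformat(s)
assert dt.tzinfo is not None       # "aware", with a fixed-offset timezone
assert dt.isoformat() == s         # round trip preserves the original
print(dt.tzinfo)
```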
[issue15873] "datetime" cannot parse ISO 8601 dates and times
New submission from John Nagle: The datetime module has support for output to a string of dates and times in ISO 8601 format ("2012-09-09T18:00:00-07:00"), with the object method "isoformat([sep])". But there's no support for parsing such strings. A string-to-datetime class method should be provided, one capable of parsing at least the RFC 3339 subset of ISO 8601.

The problem is parsing time zone information correctly. The allowed formats for the time zone are:

    (empty)    - no TZ; the date/time is "naive" in the datetime sense
    Z          - zero, or Zulu time, i.e. UTC
    [+-]nn:nn  - offset from UTC

"strptime" does not understand timezone offsets. The "datetime" documentation suggests that the "%z" format directive handles time zone info, but that's not actually implemented for input.

PyPI has four modules for parsing ISO 8601 dates. Each has at least one major problem in time zone handling:

    iso8601 0.1.4     - Abandonware. Mishandles the time zone when it is "Z" and a default time zone is specified.
    iso8601.py 0.1dev - Always returns a "naive" datetime object, even if a zone is specified.
    iso8601plus 0.1.6 - Fork of the abandonware version above. Same bug.
    zc.iso8601 0.2.0  - Zope version. Imports the pytz module with the full Olson time zone database, but doesn't actually use that database.

Thus, nothing in PyPI provides a good alternative. It would be appropriate to handle this in the datetime module. One small, correct, tested function would be better than the existing five bad alternatives.

-- components: Library (Lib) messages: 169941 nosy: nagle priority: normal severity: normal status: open title: "datetime" cannot parse ISO 8601 dates and times type: enhancement versions: Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4 ___ Python tracker <http://bugs.python.org/issue15873> ___
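(The "%z" input limitation described above was eventually lifted: in modern Python 3, from 3.7 on, strptime's %z accepts "Z" as well as "+hh:mm"/"-hh:mm" suffixes, so the forms listed here parse directly:)

```python
from datetime import datetime

# On Python 3.7+, %z handles the RFC 3339 offset forms on input.
dt = datetime.strptime('2012-09-09T18:00:00-07:00', '%Y-%m-%dT%H:%M:%S%z')
print(dt.utcoffset())    # a fixed offset of -7 hours
```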
[issue9679] unicode DNS names in urllib, urlopen
John Nagle added the comment: The current convention is that domains go into DNS lookup as punycode, and the port, query, and fragment fields of the URL are encoded with percent-escapes. See http://lists.w3.org/Archives/Public/ietf-http-wg/2011OctDec/0155.html

Python needs to get with the program here.

-- ___ Python tracker <http://bugs.python.org/issue9679> ___
[issue9679] unicode DNS names in urllib, urlopen
John Nagle added the comment: An "IRI library" is not needed to fix this problem. It's already fixed in the sockets library and the http library. We just need consistency in urllib2.

urllib2 functions which take a "url" parameter should apply "encodings.idna.ToASCII" to each label of the domain name. urllib2 functions which return a "url" value (such as "geturl()") should apply "encodings.idna.ToUnicode" to each label of the domain name.

Note that in both cases, the conversion function must be applied to each label (field between "."s) of the domain name only. Applying it to the entire domain name or the entire URL will not work.

If there are future changes to domain syntax, those should go into "encodings.idna", which is the proper library for domain syntax issues.

-- nosy: +nagle ___ Python tracker <http://bugs.python.org/issue9679> ___
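The per-label rule can be sketched with the stdlib module named above (the helper name is ours, for illustration):

```python
from encodings import idna   # stdlib module providing ToASCII / ToUnicode

def domain_to_ascii(domain):
    # Apply ToASCII to each label (the fields between "."s), never to the
    # whole domain name or the whole URL.
    return b'.'.join(idna.ToASCII(label) for label in domain.split('.'))

print(domain_to_ascii('bücher.example.com'))   # b'xn--bcher-kva.example.com'
```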
[issue11900] 2.7.1 unicode subclasses not calling __str__() for print statement
John Nagle added the comment: This has nothing to do with Python 3. There's a difference in __str__ handling between Python 2.6 and Python 2.7.2. It's enough to crash BeautifulSoup:

    [Thread-8] Unexpected EXCEPTION while processing page "http://www.verisign.com": global name '__str__' is not defined
    [Thread-8] Traceback (most recent call last):
    ...
    [Thread-8] File "C:\projects\sitetruth\BeautifulSoup.py", line 646, in prettify
    [Thread-8]   return self.__str__(encoding, True)
    [Thread-8] File "C:\projects\sitetruth\BeautifulSoup.py", line 621, in __str__
    [Thread-8]   contents = self.renderContents(encoding, prettyPrint, indentContents)
    [Thread-8] File "C:\projects\sitetruth\BeautifulSoup.py", line 656, in renderContents
    [Thread-8]   text = c.__str__(encoding)
    [Thread-8] File "C:\projects\sitetruth\BeautifulSoup.py", line 415, in __str__
    [Thread-8]   return "" % NavigableString.__str__(self, encoding)
    [Thread-8] File "C:\projects\sitetruth\BeautifulSoup.py", line 393, in __unicode__
    [Thread-8]   return __str__(self, None)
    [Thread-8] NameError: global name '__str__' is not defined

The class method that's failing is simply:

    class NavigableString(unicode, PageElement):
        ...
        def __unicode__(self):
            return __str__(self, None)    # EXCEPTION RAISED HERE

        def __str__(self, encoding=DEFAULT_OUTPUT_ENCODING):
            if encoding:
                return self.encode(encoding)
            else:
                return self

Using __str__ in the global namespace is probably wrong, and in a later version of BeautifulSoup, that code is changed to

    def __unicode__(self):
        return str(self).decode(DEFAULT_OUTPUT_ENCODING)

which seems to work. However, it is a real change from 2.6 to 2.7 that breaks code.

-- nosy: +nagle ___ Python tracker <http://bugs.python.org/issue11900> ___
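The underlying failure mode is easy to reproduce: dunder names get no special treatment in ordinary name lookup, so a bare __str__ call inside a method is just an undefined global. A minimal sketch (ours, in modern Python):

```python
# Calling __str__ as a plain function looks it up like any other name:
# local scope, then module globals, then builtins. None of those define
# __str__, so the call raises NameError, exactly as in the traceback.
class Node:
    def render(self):
        return __str__(self)    # NameError: name '__str__' is not defined

try:
    Node().render()
except NameError as e:
    print(e)
```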
[issue13288] SSL module doesn't allow access to cert issuer information
New submission from John Nagle : The SSL module still doesn't return much information from the certificate. SSLSocket.getpeercert only returns a few basic items about the certificate subject. You can't retrieve issuer information, and you can't get the extensions needed to check if a cert is an EV cert. With the latest flaps about phony cert issuers (another CA compromise hit the news today), it's worth having issuer info available. It was available in the old M2Crypto module, but not in the current Python SSL module. John Nagle -- components: Library (Lib) messages: 146579 nosy: nagle priority: normal severity: normal status: open title: SSL module doesn't allow access to cert issuer information versions: Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4 ___ Python tracker <http://bugs.python.org/issue13288> ___
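(Later Python versions did add this: since 3.2, the dict returned by SSLSocket.getpeercert() includes an 'issuer' entry. A sketch of pulling a value out of that structure; the certificate contents below are invented for illustration, not taken from a real handshake:)

```python
# Shape of the dict getpeercert() returns in Python 3.2+: each of
# 'subject' and 'issuer' is a tuple of RDNs, and each RDN is a tuple
# of (attribute-name, value) pairs. Values here are hypothetical.
sample_cert = {
    'subject': ((('commonName', 'www.example.org'),),),
    'issuer': ((('countryName', 'US'),),
               (('organizationName', 'Example Trust Services'),),
               (('commonName', 'Example Intermediate CA'),)),
}

def issuer_field(cert, field):
    # Walk the issuer RDN sequence looking for a named attribute.
    for rdn in cert.get('issuer', ()):
        for name, value in rdn:
            if name == field:
                return value
    return None

print(issuer_field(sample_cert, 'commonName'))   # Example Intermediate CA
```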
[issue10202] ftplib doesn't check close status after sending file
John Nagle added the comment: Proper behavior for ftplib when sending is to send all desired data, then call "sock.shutdown(socket.SHUT_RDWR)". This indicates that no more data will be sent, and blocks until the receiver has acknowledged all their data.

"socketmodule.c" handles this right for "shutdown": "shutdown" is called on the socket, and the return value is checked. If the return value is negative, the error handler is invoked and an exception is raised. Compare the handling in "close".

FTP send is one of the few situations where this matters, because FTP uses the close of the data connection to indicate EOF.

-- ___ Python tracker <http://bugs.python.org/issue10202> ___
[issue10202] ftplib doesn't check close status after sending file
New submission from John Nagle : "ftplib" doesn't check the status on socket close after writing. This can lead to silently truncated files when sending files with "ftplib". A report of truncated files on comp.lang.python led me to check the source code.

The "ftplib" module does sending by calling sock_sendall in "socketmodule.c". That does an OS-level "send", and once everything has been sent, returns. But an OS-level socket send returns when the data is queued for sending, not when it is delivered. Only when the socket is closed, and the close status checked, do you know if the data was delivered. There's a final TCP close handshake that occurs when close has been called at both ends, and only when it completes successfully do you know that the data has been delivered. At the socket level, this is performed by "shutdown" (which closes the connection and returns the proper network status information), or by "close".

Look at sock_close in "socketmodule.c". Note that it ignores the return status on close, always returns None, and never raises an exception. As the Linux manual page for "close" says:

    "Not checking the return value of close() is a common but nevertheless serious programming error. It is quite possible that errors on a previous write(2) operation are first reported at the final close(). Not checking the return value when closing the file may lead to silent loss of data."

"ftplib", in "storlines" and "storbinary", calls "close" without checking the status or calling "shutdown" first. So if the other end disconnects after all data has been queued locally but not sent, received and acknowledged, the sender will never know.
-- components: Library (Lib) messages: 119638 nosy: nagle priority: normal severity: normal status: open title: ftplib doesn't check close status after sending file type: behavior versions: Python 2.5, Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3 ___ Python tracker <http://bugs.python.org/issue10202> ___
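The remedy the report points at, send everything and then call shutdown() so delivery errors surface as exceptions instead of being swallowed by close(), can be sketched on a local socket pair:

```python
import socket

# Send, then shut down explicitly. On a real TCP connection, an error in
# the close handshake would raise from shutdown() here, instead of being
# silently ignored as it is when only close() is called.
sender, receiver = socket.socketpair()
sender.sendall(b'last block of the file')
sender.shutdown(socket.SHUT_RDWR)       # signals EOF; failures raise here
print(receiver.recv(64))                # b'last block of the file'
receiver.close()
sender.close()
```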
[issue7558] Python 3.1.1 installer botches upgrade when installation is not on C drive.
John Nagle added the comment: Cancel bug report. It was my error. The installer says it is replacing the existing installation, but by default installs it in C:. But that can be overridden in the directory entry field below the big empty white entry box. -- ___ Python tracker <http://bugs.python.org/issue7558> ___
[issue7558] Python 3.1.1 installer botches upgrade when installation is not on C drive.
New submission from John Nagle : I just installed "python3.1.1.msi" on a system that had "python3.1.msi" installed in "D:/python31". In this situation, the installer does not ask the user for a destination directory. The installer found the old installation in "D:/python31", removed most but not all of the files there, and then installed the new version in "C:/python31". I uninstalled the failed install, and reinstalled. On a new install, the installer prompts for the destination dir, and that works. Upgrade installs, though, are botched. John Nagle -- components: Installation messages: 96768 nosy: nagle severity: normal status: open title: Python 3.1.1 installer botches upgrade when installation is not on C drive. type: behavior versions: Python 3.1 ___ Python tracker <http://bugs.python.org/issue7558> ___
[issue1712522] urllib.quote throws exception on Unicode URL
John Nagle added the comment: Note that the problem can't be solved by telling end users to call a different "quote" function. The problem is down inside a library module. "robotparser" is calling "urllib.quote". One of those two library modules needs to be fixed. -- ___ Python tracker <http://bugs.python.org/issue1712522> ___
[issue1637] urlparse.urlparse misparses URLs with query but no path
John Nagle added the comment: I tried downloading the latest rev of urlparse.py (59480), and it flunked its own unit test, "urlparse.test()". Two test cases fail. So I don't want to try to fix the module until the last people to change it fix their unit test problems. The fix I provided should fix the problem I reported, but I'm not sure if there's anything else wrong, since it flunks its unit test. __ Tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue1637> __
[issue1637] urlparse.urlparse misparses URLs with query but no path
New submission from John Nagle: urlparse.urlparse will mis-parse URLs which have a "/" after a "?".

    sa1 = 'http://example.com?blahblah=/foo'
    sa2 = 'http://example.com?blahblah=foo'
    print urlparse.urlparse(sa1)
    ('http', 'example.com?blahblah=', '/foo', '', '', '')   # WRONG
    print urlparse.urlparse(sa2)
    ('http', 'example.com', '', '', 'blahblah=foo', '')     # RIGHT

That's wrong. RFC 3986 ("Uniform Resource Identifier (URI): Generic Syntax"), page 23, says: "The characters slash ("/") and question mark ("?") may represent data within the query component. Beware that some older, erroneous implementations may not handle such data correctly when it is used as the base URI for relative references (Section 5.1), apparently because they fail to distinguish query data from path data when looking for hierarchical separators."

So "urlparse" is an "older, erroneous implementation". Looking at the code for "urlparse", it references RFC 1808 (1995), which was a long time ago, three revisions back.

Here's the bad code:

    def _splitnetloc(url, start=0):
        for c in '/?#': # the order is important!
            delim = url.find(c, start)
            if delim >= 0:
                break
        else:
            delim = len(url)
        return url[start:delim], url[delim:]

That's just wrong. The domain ends at the first appearance of any character in '/?#', but that code returns the text before the first '/' even if there's an earlier '?'. A URL/URI doesn't have to have a path, even when it has query parameters.

OK, here's a fix to "urlparse", replacing _splitnetloc. I didn't use a regular expression because "urlparse" doesn't import "re", and I didn't want to change that.
    def _splitnetloc(url, start=0):
        delim = len(url)                    # position of end of domain part of url, default is end
        for c in '/?#':                     # look for delimiters; the order is NOT important
            wdelim = url.find(c, start)     # find first of this delim
            if wdelim >= 0:                 # if found
                delim = min(delim, wdelim)  # use earliest delim position
        return url[start:delim], url[delim:]  # return (domain, rest)

__ Tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue1637> __
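For the record, this fix (or an equivalent) was adopted long ago; modern urllib.parse splits the netloc at the earliest of '/', '?', '#', as a quick check shows:

```python
from urllib.parse import urlparse   # Python 3 home of the old urlparse module

# The query-only URL from the report now parses correctly: the netloc
# ends at the '?', and the '/' lands in the query, not the path.
r = urlparse('http://example.com?blahblah=/foo')
print((r.netloc, r.path, r.query))  # ('example.com', '', 'blahblah=/foo')
```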
[issue1637] urlparse.urlparse misparses URLs with query but no path
Changes by John Nagle: -- components: Library (Lib) nosy: nagle severity: normal status: open title: urlparse.urlparse misparses URLs with query but no path type: behavior versions: Python 2.4, Python 2.5 __ Tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue1637> __