Yeah, test_tokenize is weird; I've been looking into it as well. Here's a sample failure from a Windows buildbot:
File "S:\buildbots\python\3.0.nelson-windows\build\lib\test\test_tokenize.py", line ?, in test.test_tokenize.__test__.doctests Failed example: for testfile in testfiles: if not roundtrip(open(testfile)): break else: True Exception raised: Traceback (most recent call last): File "S:\buildbots\python\3.0.nelson-windows\build\lib\doctest.py", line 1227, in __run compileflags, 1), test.globs) File "<doctest test.test_tokenize.__test__.doctests[56]>", line 2, in <module> if not roundtrip(open(testfile)): break File "<doctest test.test_tokenize.__test__.doctests[5]>", line 3, in roundtrip token_list = list(generate_tokens(f.readline)) File "S:\buildbots\python\3.0.nelson-windows\build\lib\tokenize.py", line 264, in generate_tokens line = readline() File "S:\buildbots\python\3.0.nelson-windows\build\lib\io.py", line 1467, in readline readahead, pending = self._read_chunk() File "S:\buildbots\python\3.0.nelson-windows\build\lib\io.py", line 1278, in _read_chunk pending = self._decoder.decode(readahead, not readahead) File "S:\buildbots\python\3.0.nelson-windows\build\lib\io.py", line 1081, in decode output = self.decoder.decode(input, final=final) File "S:\buildbots\python\3.0.nelson-windows\build\lib\encodings\cp1252.py", line 23, in decode return codecs.charmap_decode(input,self.errors,decoding_table)[0] UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 17: character maps to <undefined> The following is at the end of the doctests in test_tokenize: >>> tempdir = os.path.dirname(f) or os.curdir >>> testfiles = glob.glob(os.path.join(tempdir, "test*.py")) >>> if not test_support.is_resource_enabled("compiler"): ... testfiles = random.sample(testfiles, 10) ... >>> for testfile in testfiles: ... if not roundtrip(open(testfile)): break ... else: True True On that first line, 'f' is lib/test/tokenize_tests.txt, so basically, it's grabbing ten random test*.py files in lib/test and running untokenize(generate_tokens(f.readline)) on each one. In order to figure out which file it's dying on, I added the following to test_tokenize.py: def test_tokenize_all(): import glob import os tempdir = os.path.dirname(__file__) or os.curdir testfiles = glob.glob(os.path.join(tempdir, "test*.py")) for file in testfiles: print("processing file: " + file) print("roundtrip(open(file)): " + roundtrip(open(file))) This results in different results: Python 3.0a3+ (py3k, Mar 16 2008, 10:41:45) [MSC v.1500 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> from test import test_tokenize [50808 refs] >>> test_tokenize.test_tokenize_all() processing file: s:\src\svn\svn.python.org\projects\python\branches\py3k\lib\test\testcodec.py Traceback (most recent call last): File "<stdin>", line 1, in <module> File "s:\src\svn\svn.python.org\projects\python\branches\py3k\lib\test\test_tokenize.py", line 565, in test_tokenize_all print("roundtrip(open(file)): " + roundtrip(open(file))) File "s:\src\svn\svn.python.org\projects\python\branches\py3k\lib\test\test_tokenize.py", line 514, in roundtrip source = untokenize(generate_tokens(f.readline)) File "s:\src\svn\svn.python.org\projects\python\branches\py3k\lib\tokenize.py", line 238, in untokenize return ut.untokenize(iterable) File "s:\src\svn\svn.python.org\projects\python\branches\py3k\lib\tokenize.py", line 183, in untokenize self.add_whitespace(start) File "s:\src\svn\svn.python.org\projects\python\branches\py3k\lib\tokenize.py", line 172, in add_whitespace assert row <= self.prev_row AssertionError [52668 refs] Yay. 
And to make this even more interesting, the test passes when run directly:

    s:\src\svn\svn.python.org\projects\python\branches\py3k\PCbuild>python_d ..\Lib\test\test_tokenize.py
    doctest (test.test_tokenize) ... 62 tests with zero failures
    [61919 refs]

Oh, and while we're here, the same test under regrtest:

    s:\src\svn\svn.python.org\projects\python\branches\py3k\PCbuild>python_d ..\lib\test\regrtest.py -q -uall -rw test_tokenize
    **********************************************************************
    File "s:\src\svn\svn.python.org\projects\python\branches\py3k\lib\test\test_tokenize.py", line ?, in test.test_tokenize.__test__.doctests
    Failed example:
        for testfile in testfiles:
            if not roundtrip(open(testfile)): break
        else: True
    Exception raised:
        Traceback (most recent call last):
          File "s:\src\svn\svn.python.org\projects\python\branches\py3k\lib\doctest.py", line 1227, in __run
            compileflags, 1), test.globs)
          File "<doctest test.test_tokenize.__test__.doctests[56]>", line 2, in <module>
            if not roundtrip(open(testfile)): break
          File "<doctest test.test_tokenize.__test__.doctests[5]>", line 3, in roundtrip
            token_list = list(generate_tokens(f.readline))
          File "s:\src\svn\svn.python.org\projects\python\branches\py3k\lib\tokenize.py", line 264, in generate_tokens
            line = readline()
          File "s:\src\svn\svn.python.org\projects\python\branches\py3k\lib\io.py", line 1467, in readline
            readahead, pending = self._read_chunk()
          File "s:\src\svn\svn.python.org\projects\python\branches\py3k\lib\io.py", line 1278, in _read_chunk
            pending = self._decoder.decode(readahead, not readahead)
          File "s:\src\svn\svn.python.org\projects\python\branches\py3k\lib\io.py", line 1081, in decode
            output = self.decoder.decode(input, final=final)
          File "s:\src\svn\svn.python.org\projects\python\branches\py3k\lib\encodings\cp1252.py", line 23, in decode
            return codecs.charmap_decode(input,self.errors,decoding_table)[0]
        UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 17: character maps to <undefined>
    **********************************************************************
    1 items had failures:
       1 of  57 in test.test_tokenize.__test__.doctests
    ***Test Failed*** 1 failures.
    test test_tokenize failed -- 1 of 62 doctests failed
    1 test failed:
        test_tokenize

Presumably that discrepancy comes from the resource check in the doctest: under regrtest -uall, every resource (including "compiler") is enabled, so all of the test*.py files get roundtripped instead of a random sample of ten, and the file with the offending bytes is always hit; run standalone, the sample of ten often misses it.
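A quick way to see which candidate files even contain bytes that the Windows locale codec can't handle (a diagnostic sketch only, not a fix):

    # Diagnostic sketch: list the test*.py files whose raw bytes don't
    # decode under cp1252, the locale encoding on the failing buildbot.
    import glob
    import os
    tempdir = os.path.dirname(__file__) or os.curdir  # run from Lib/test
    for testfile in glob.glob(os.path.join(tempdir, "test*.py")):
        data = open(testfile, "rb").read()
        try:
            data.decode("cp1252")
        except UnicodeDecodeError as exc:
            print("%s: %s" % (testfile, exc))

Any file this prints is a candidate for the buildbot failure, since plain open() in py3k decodes with the locale's preferred encoding.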
Trent.

________________________________________
From: [EMAIL PROTECTED] [EMAIL PROTECTED] On Behalf Of Mark Dickinson [EMAIL PROTECTED]
Sent: 16 March 2008 18:13
To: Python Dev
Subject: Re: [Python-Dev] 3.0 buildbots all red

On Sun, Mar 16, 2008 at 1:32 PM, Neal Norwitz <[EMAIL PROTECTED]> wrote:
>
> I think this is possible, though considerable work.  Probably the
> biggest win will be creating a mock for socket and using mock sockets
> in the tests for asyn{core,chat}, smtplib, xmlrpc, etc.  That will fix
> about 75% of the problems on 2.6.  The remaining problems are:
>
>  * test_asyn{chat,core} might not be meaningful with mock sockets and are flaky
>  * the alpha fails test_signal/socket for weird alarm conditions.
>    this might be hard to debug/fix (I have access to this box though)
>  * test_sqlite is broken on x86 with an old sqlite (I have access to this box)
>  * test_bsddb may be flaky, I'm not sure
>  * probably a few platform specific problems

test_tokenize is also currently (sometimes) failing on many of the bots.  I've been looking into it, but I'm struggling to find the problem.  The traceback, e.g. for the amd64 gentoo buildbot, ends with:

      File "/home/buildbot/slave/py-build/3.0.norwitz-amd64/build/Lib/io.py", line 1081, in decode
        output = self.decoder.decode(input, final=final)
      File "/home/buildbot/slave/py-build/3.0.norwitz-amd64/build/Lib/codecs.py", line 291, in decode
        (result, consumed) = self._buffer_decode(data, self.errors, final)
    UnicodeDecodeError: 'utf8' codec can't decode bytes in position 12-15: invalid data

On my own machine (SuSE 9.3/i686) I'm seeing this test pass about 80% of the time and fail the other 20% with something like the above, with the position of the reported invalid data changing from run to run.  It looks like data are getting corrupted somewhere along the line.  Anyone have any ideas?

Mark
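(One sanity check on Mark's traceback, for the record: an incremental UTF-8 decoder is supposed to cope with multi-byte sequences that get split across chunk reads, so the chunked reading in io.py shouldn't, by itself, produce "invalid data".  A quick sketch demonstrating that:

    import codecs

    # Feed UTF-8 bytes to an incremental decoder in awkward 3-byte chunks,
    # deliberately splitting multi-byte sequences across chunk boundaries.
    text = "h\u00e9llo w\u00f6rld"
    data = text.encode("utf-8")
    decoder = codecs.getincrementaldecoder("utf-8")()
    out = ""
    for i in range(0, len(data), 3):
        out += decoder.decode(data[i:i + 3], final=False)
    out += decoder.decode(b"", final=True)
    assert out == text  # the split sequences decode correctly

If that holds, invalid data at positions that vary from run to run suggests the bytes reaching the decoder are already wrong, i.e. the corruption is upstream of the codec, which is consistent with Mark's suspicion.)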