[issue719888] tokenize module w/ coding cookie

2008-04-22 Thread Trent Nelson
Trent Nelson <[EMAIL PROTECTED]> added the comment: This was fixed in trunk in r61573, and merged to py3k in r61982. -- status: open -> closed Tracker <[EMAIL PROTECTED]>

[issue719888] tokenize module w/ coding cookie

2008-03-18 Thread Martin v. Löwis
Martin v. Löwis <[EMAIL PROTECTED]> added the comment: > Is it worth keeping generate_tokens as an alias for tokenize, just > to avoid gratuitous 2-to-3 breakage? Maybe not---I guess they're > different beasts, in that one wants a string-valued iterator and the > other wants a bytes-valued iterator

[issue719888] tokenize module w/ coding cookie

2008-03-18 Thread Mark Dickinson
Mark Dickinson <[EMAIL PROTECTED]> added the comment: All tests pass for me on OS X 10.5.2 and SuSE Linux 10.2 (32-bit)!

[issue719888] tokenize module w/ coding cookie

2008-03-18 Thread Michael Foord
Michael Foord <[EMAIL PROTECTED]> added the comment: *Full* patch (excluding the new dependent test text files) for Python 3. Includes fixes for standard library and tools usage of tokenize. If it breaks anything blame Trent... ;-) -- versions: -Python 2.6 Added file: http://bugs.pytho

[issue719888] tokenize module w/ coding cookie

2008-03-18 Thread Michael Foord
Michael Foord <[EMAIL PROTECTED]> added the comment: If you remove the following line from the tests (which generates spurious additional output on stdout) then the problem goes away: print('testing: %s' % path, end='\n')

[issue719888] tokenize module w/ coding cookie

2008-03-18 Thread Mark Dickinson
Mark Dickinson <[EMAIL PROTECTED]> added the comment: With the patch, ./python.exe Lib/test/regrtest.py test_tokenize fails for me with the following output: Macintosh-2:py3k dickinsm$ ./python.exe Lib/test/regrtest.py test_tokenize test_tokenize test test_tokenize produced unexpected output:

[issue719888] tokenize module w/ coding cookie

2008-03-18 Thread Trent Nelson
Trent Nelson <[EMAIL PROTECTED]> added the comment: Tested patch on Win x86/x64 2k8, XP & FreeBSD 6.2, +1. -- assignee: -> Trent.Nelson keywords: +patch

[issue719888] tokenize module w/ coding cookie

2008-03-18 Thread Mark Dickinson
Mark Dickinson <[EMAIL PROTECTED]> added the comment: Sorry---ignore the last comment; if readline() doesn't supply bytes then the line.decode('ascii') will fail with an AttributeError. So there won't be silent failure. I'll try thinking first and posting later next time.
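The fail-loudly behavior Mark describes can be checked directly. In current Python 3 the bytes-based tokenize.tokenize raises immediately during encoding detection when handed a str-producing readline; the exact exception type has shifted across versions (an AttributeError from the .decode call in the code discussed here, a TypeError in later CPython), so the sketch below catches both:

```python
import io
import tokenize

# Handing an already-decoded (str) readline to the bytes-based API
# fails loudly while detecting the encoding, instead of silently
# mis-tokenizing the source.
try:
    list(tokenize.tokenize(io.StringIO("x = 1\n").readline))
except (TypeError, AttributeError) as exc:
    print("failed loudly:", type(exc).__name__)
```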

[issue719888] tokenize module w/ coding cookie

2008-03-18 Thread Mark Dickinson
Mark Dickinson <[EMAIL PROTECTED]> added the comment: Is it worth keeping generate_tokens as an alias for tokenize, just to avoid gratuitous 2-to-3 breakage? Maybe not---I guess they're different beasts, in that one wants a string-valued iterator and the other wants a bytes-valued iterator. So
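For context, the two entry points under discussion both exist in the tokenize module as it ships in current Python 3: tokenize.tokenize consumes a bytes-valued readline and honors the coding cookie, while generate_tokens consumes an already-decoded str-valued readline. A minimal sketch of the split:

```python
import io
import tokenize

# Bytes-based entry point: reads raw source bytes and honors a PEP 263
# coding cookie; the first token it yields carries the detected encoding.
source = b"# -*- coding: utf-8 -*-\nx = 1\n"
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))
print(tokens[0].type == tokenize.ENCODING)  # True
print(tokens[0].string)                     # utf-8

# String-based entry point: the caller has already decoded the source,
# so no cookie handling happens and no ENCODING token is emitted.
str_tokens = list(tokenize.generate_tokens(io.StringIO("x = 1\n").readline))
print(str_tokens[0].string)                 # x
```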

[issue719888] tokenize module w/ coding cookie

2008-03-18 Thread Michael Foord
Michael Foord <[EMAIL PROTECTED]> added the comment: That was 'by discussion with wiser heads than I'. The existing module has an old backwards compatibility interface called 'tokenize'. That can be deprecated in 2.6. As 'tokenize' is really the ideal name for the main entry point for the module

[issue719888] tokenize module w/ coding cookie

2008-03-18 Thread Mark Dickinson
Mark Dickinson <[EMAIL PROTECTED]> added the comment: Michael, is the disappearance of the generate_tokens function in the new version of tokenize.py intentional?

[issue719888] tokenize module w/ coding cookie

2008-03-18 Thread Michael Foord
Michael Foord <[EMAIL PROTECTED]> added the comment: Made quite extensive changes to tokenize.py (with tests) for Py3k. This migrates it to a 'bytes' API so that it can correctly decode Python source files following PEP-0263. -- nosy: +fuzzyman Added file: http://bugs.python.org/file9735

[issue719888] tokenize module w/ coding cookie

2008-03-16 Thread Trent Nelson
Trent Nelson <[EMAIL PROTECTED]> added the comment: Hmm, I take it multiple file uploads aren't supported. I don't want to use svn diff for the text files as it looks like it's butchering the BOM encodings, so, tar it is! (Untar in the root py3k/ directory.) Added file: http://bugs.python.org/fi

[issue719888] tokenize module w/ coding cookie

2008-03-16 Thread Trent Nelson
Trent Nelson <[EMAIL PROTECTED]> added the comment: I've attached a patch to test_tokenize.py and a bunch of text files (that should be dropped into Lib/test) that highlight this issue a *lot* better than the current state of affairs. The existing implementation defines roundup() in the doctest

[issue719888] tokenize module w/ coding cookie

2008-03-16 Thread Martin v. Löwis
Martin v. Löwis <[EMAIL PROTECTED]> added the comment: In 3k, the tokenize module should definitely return strings, and, in doing so, it should definitely consider the encoding declaration (and also the default encoding in absence of the encoding declaration). For 2.6, I wouldn't mind if it we
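Both behaviors Martin asks for ended up in Python 3's tokenize.detect_encoding: it reports a declared coding cookie when one is present, and falls back to the UTF-8 default when there is no cookie or BOM. A small sketch (note detect_encoding normalizes some codec aliases, so a stable spelling is used here):

```python
import io
import tokenize

# With a PEP 263 coding cookie, the declared encoding is reported,
# along with the header lines consumed while looking for it.
src = b"# -*- coding: iso-8859-1 -*-\npass\n"
encoding, header_lines = tokenize.detect_encoding(io.BytesIO(src).readline)
print(encoding)  # iso-8859-1

# Without a cookie or BOM, the Python 3 default encoding applies.
encoding, _ = tokenize.detect_encoding(io.BytesIO(b"pass\n").readline)
print(encoding)  # utf-8
```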

[issue719888] tokenize module w/ coding cookie

2008-03-16 Thread Mark Dickinson
Mark Dickinson <[EMAIL PROTECTED]> added the comment: This issue is currently causing test_tokenize failures in Python 3.0. There are other ways to fix the test failures, but making tokenize honor the source file encoding seems like the right thing to do to me. Does this still seem like a good