[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Martin v. Löwis
Martin v. Löwis added the comment: > Are you serious? This sounds like a py4k idea. Can you give us a > hint on what the new representation will be? I'm thinking about an approach of a variable representation: one, two, or four bytes, depending on the widest character that appears in the stri
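The variable-width idea in this comment can be illustrated with a short Python sketch. It only shows how a width of one, two, or four bytes could be chosen from the widest code point in a string; the helper name `bytes_per_char` and the thresholds 0xFF and 0xFFFF are assumptions made for illustration, not details given in the comment.

```python
def bytes_per_char(text):
    """Return a hypothetical per-character storage width for text."""
    # The widest code point decides the width for the whole string.
    widest = max(map(ord, text), default=0)
    if widest <= 0xFF:
        return 1
    if widest <= 0xFFFF:
        return 2
    return 4


for sample in ("spam", "\u20ac100", "\U00010300 = 5"):
    print(ascii(sample), bytes_per_char(sample))
```

Under this scheme a single non-BMP character makes the whole string four bytes wide, which is the trade-off such a design would have to accept.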

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-29 Thread Gregory P. Smith
Gregory P. Smith added the comment: As belopolsky said... *please* move this development into version control. Put it up in an Hg repo on code.google.com, or put it on GitHub. *anything* other than repeatedly posting entire zip file source code drops to a bugtracker.

[issue10793] hashlib.hash.digest() documentation incorrect re return type

2010-12-29 Thread Senthil Kumaran
Senthil Kumaran added the comment: Fixed in r87573 and r87574 -- nosy: +orsenthil resolution: -> fixed stage: -> committed/rejected status: open -> closed type: -> behavior ___ Python tracker __

[issue9893] Usefulness of the Misc/Vim/ files?

2010-12-29 Thread Senthil Kumaran
Senthil Kumaran added the comment: On Mon, Dec 27, 2010 at 07:59:46PM +, Brett Cannon wrote: > But if you have a local copy of the Vim files from the community > what is preventing you from editing them for new keywords and > sending a patch to the maintainer so that the rest of the communi

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-29 Thread Jacques Grove
Jacques Grove added the comment: Another one that diverges between stock regex and issue2636-20101229.zip: re.search('A\s*?.*?(\n+.*?\s*?){0,2}\(X', 'A\n1\nS\n1 (X')

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-29 Thread Jacques Grove
Jacques Grove added the comment: re.search('\d{4}(\s*\w)?\W*((?!\d)\w){2}', "XX") matches on stock 2.6.5 regex module, but not on issue2636-20101230.zip or issue2636-20101229.zip (which I've fallen back to for now)

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-29 Thread Jacques Grove
Jacques Grove added the comment: Yeah, issue2636-20101230.zip DOES reduce memory usage significantly (30-50%) in my use cases; however, it also tanks performance overall by 35% for me, so I'll prefer to stick with issue2636-20101229.zip (or some variant of it). Maybe a regex compile

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: On Wed, Dec 29, 2010 at 9:38 PM, Alexander Belopolsky wrote: .. > Given that until recently (r87433) the PEP and the reference manual > disagreed on the definition, Actually, it looks like PEP 3131 and the Language Reference [1] still disagree. The latt

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: On Wed, Dec 29, 2010 at 8:02 PM, Martin v. Löwis wrote: .. > > I plan to propose a complete redesign of the representation of Unicode > strings, which may well make this entire set of changes obsolete. > Are you serious? This sounds like a py4k idea. C

[issue8821] Range check on unicode repr

2010-12-29 Thread Matt Giuca
Matt Giuca added the comment: > I think that we have good reasons to not remove the NUL character. Please note: Nobody is suggesting that we remove the NUL character. I was merely suggesting that we don't rely on it where it is unnecessary. Returning to my original patch: If the code was usin

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-29 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101230.zip is a new version of the regex module. I've delayed the building of the tables for fast searching until their first use, which, hopefully, will mean that fewer will be actually built. -- Added file: http://bugs.python.org/file20

[issue3232] Wrong str->bytes conversion in Lib/encodings/idna.py

2010-12-29 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: Arguably, it is not a bug if codec's decode method rejects unicode strings with a TypeError. The 2.x implementation seems to allow decoding of ASCII-only unicode labels joined by arbitrary RFC 3490 separators. I am not sure what the use case for this

[issue10795] standard library do not use ssl as recommended

2010-12-29 Thread Mads Kiilerich
New submission from Mads Kiilerich: As discussed on issue1589 it is now possible to create decent ssl connections with the ssl module - assuming ca_certs is specified and it is checked that the certificate matches. The standard library, however, neither does that nor makes it possible to do it

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Martin v. Löwis
Martin v. Löwis added the comment: >> Seriously, it can wait 3.3. > > What exactly can wait until 3.3? The presented patch introduces no > user visible changes. It is only a stepping stone to restoring some > sanity in a way supplementary characters are treated by narrow builds. > At the mom

[issue10794] Infinite recursion while garbage collecting loops indefinitely

2010-12-29 Thread Mihai Rusu
New submission from Mihai Rusu : Hi While working on some Python code I stumbled on a situation where the Python process seems to hang indefinitely. Further debugging points to the following conclusion: if there is a class that somehow manages to run into an infinite recursion (properly detec

[issue3232] Wrong str->bytes conversion in Lib/encodings/idna.py

2010-12-29 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: Martin's original code (r32301) was pretty clear:
32301 loewis # IDNA allows decoding to operate on Unicode strings, too.
32301 loewis if isinstance(input, unicode):
32301 loewis labels = dots.split(input)
3230

[issue8821] Range check on unicode repr

2010-12-29 Thread STINNER Victor
STINNER Victor added the comment: > Unicode objects are NUL-terminated, but only very external APIs > rely on this (e.g. code using the Windows Unicode API). All Py_UNICODE_str*() functions rely on the NUL character. They are useful when patching a function from bytes (char*) to unicode (PyUni

[issue10793] hashlib.hash.digest() documentation incorrect re return type

2010-12-29 Thread SilentGhost
SilentGhost added the comment: One-word patch attached. -- keywords: +patch nosy: +SilentGhost Added file: http://bugs.python.org/file20191/hashlib.rst.diff

[issue10793] hashlib.hash.digest() documentation incorrect re return type

2010-12-29 Thread Thorsten Behrens
New submission from Thorsten Behrens : The documentation for hashlib.hash.digest() states that digest() will "[r]eturn the digest of the data passed to the update() method so far. This is a bytes array of size digest_size[...]". The returned object is of class 'bytes', not 'bytearray'. Documen
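The distinction drawn in this report is easy to check interactively. The following is a minimal sketch on a current Python 3; the choice of `sha256` and the input bytes are arbitrary and are not taken from the report.

```python
import hashlib

h = hashlib.sha256()
h.update(b"python")
digest = h.digest()

# digest() returns an immutable bytes object, not a bytearray.
print(type(digest).__name__)
print(isinstance(digest, bytearray))
print(len(digest) == h.digest_size)
```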

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-29 Thread Jacques Grove
Jacques Grove added the comment: More an observation than a bug: I understand that we're trading memory for performance, but I've noticed that the peak memory usage is rather high, e.g.:
$ cat test.py
import os
import regex as re
def resident():
    for line in open('/proc/%d/status' % os.ge

[issue8821] Range check on unicode repr

2010-12-29 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: [MAL] > * Unicode objects are NUL-terminated, but only very external APIs >rely on this (e.g. code using the Windows Unicode API). Please >don't make the code in unicodeobject.c itself rely on this >subtle detail. I would like to note that se

[issue1674555] sys.path in tests contains system directories

2010-12-29 Thread R. David Murray
R. David Murray added the comment: One way to "fix" this would be to have make test run the tests with -j1 and pass in the -S and -s flags, and then have regrtest special case test_site and remove those flags for the run of that single test. An interesting facet of this proposal is that it ac

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2010-12-29 Thread Alexander Belopolsky
Changes by Alexander Belopolsky : -- nosy: +belopolsky ___ Python tracker ___ ___ Python-bugs-list mailing list Unsubscribe: http://ma

[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread David Beazley
David Beazley added the comment: Hmmm. Interesting. In the big picture, it might be an interesting project for someone (not necessarily the core devs) to sit down and refactor both of these modules so that they play nice with Python 3 I/O system. Obviously that's a project outside the scope

[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread Antoine Pitrou
Antoine Pitrou added the comment: > Do Python devs really view gzip and bz2 as two totally completely > different animals? They both have the same functionality and would be > used for the same kinds of things. Maybe I'm missing something. Well, the reality of divergent implementation strate

[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread David Beazley
David Beazley added the comment: Do Python devs really view gzip and bz2 as two totally completely different animals? They both have the same functionality and would be used for the same kinds of things. Maybe I'm missing something.

[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread Antoine Pitrou
Antoine Pitrou added the comment: > C or not, wrapping a BZ2File instance with a TextIOWrapper to get text > still seems like something that someone might want to do. I doubt it > would take much modification to give BZ2File instances the required > set of methods. BZ2File uses FILE pointers i

[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread R. David Murray
R. David Murray added the comment: Right, but in the bz2 case I think it is a feature request rather than a bugfix. In any case it should be a separate issue.

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: On Wed, Dec 29, 2010 at 3:36 PM, STINNER Victor wrote: .. > Use non-ASCII identifiers is exotic. Use non-BMP identifiers is > crazy :-) Hmm, we clearly disagree on what crosses the boundary of the mental norm. IMHO, it is crazy to require users to care

[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread David Beazley
David Beazley added the comment: C or not, wrapping a BZ2File instance with a TextIOWrapper to get text still seems like something that someone might want to do. I doubt it would take much modification to give BZ2File instances the required set of methods.

[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread Antoine Pitrou
Antoine Pitrou added the comment: > While you're at it, maybe someone could add an 'open' function to bz2 > to make it symmetrical with gzip as well :-). That's a nice idea, but quite orthogonal to this issue.

[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread R. David Murray
R. David Murray added the comment: bz2 is a pure C module, so that's a very different situation.

[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread David Beazley
David Beazley added the comment: It goes without saying that this also needs to be checked with the bz2 module. A quick check seems to indicate that it has the same problem. While you're at it, maybe someone could add an 'open' function to bz2 to make it symmetrical with gzip as well :-).

[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread Antoine Pitrou
Antoine Pitrou added the comment: This should be easy to fix, if only the "readable" and "writable" methods are needed. Do you want to try writing a patch?

[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread R. David Murray
R. David Murray added the comment: Heh, and 2.7. Fixing versions yet again. -- versions: +Python 2.7

[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread R. David Murray
R. David Murray added the comment: Oops. It only has that inheritance in 3.2. -- versions: -Python 2.7, Python 3.1

[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread R. David Murray
R. David Murray added the comment: Since GZipFile inherits from BufferedIOBase, and TextIOWrapper is supposed to be designed to wrap a BufferedIOBase object, I would say yes it ought to work. On the other hand there may also be a doc error there, since it may be that TextIOWrapper actually n

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread STINNER Victor
STINNER Victor added the comment: Le mercredi 29 décembre 2010 à 19:26 +, Alexander Belopolsky a écrit : > Would it look as exotic if presented like this? > > File "", line 1 > 𐌀 = 5 >^ > SyntaxError: invalid character in identifier > (works on a wide build) Use non-ASCII ide

[issue10792] Compile() and 'Windows/Mac newlines'

2010-12-29 Thread Terry J. Reedy
Terry J. Reedy added the comment: I made a mistake in testing. Sorry for the noise. -- resolution: -> invalid status: open -> closed

[issue10792] Compile() and 'Windows/Mac newlines'

2010-12-29 Thread Terry J. Reedy
New submission from Terry J. Reedy : In python-list thread "Does Python 3.1 accept \r\n in compile()?" jmfauth notes that compile('print(999)\r\n', '', 'exec') works in 2.7 but not 3.1 (and 3.2 not checked) because 3.1 sees '\r' as SyntaxError. I started to respond that this is part of Py3 clean
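The behaviour under discussion can be checked directly. This is a minimal sketch assuming a current Python 3, where `compile()` is documented to accept Windows and Mac newlines; the source string is a made-up example rather than the one from the python-list thread.

```python
# One Windows-style (\r\n) and one old Mac-style (\r) line ending.
source = "x = 999\r\ny = x + 1\r"

code = compile(source, "<string>", "exec")
namespace = {}
exec(code, namespace)
print(namespace["y"])
```

Assigning into a namespace dictionary instead of calling `print` inside the compiled code keeps the result easy to inspect.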

[issue8618] test_winsound fails when no playback devices configured

2010-12-29 Thread Brian Curtin
Brian Curtin added the comment: Looks like whatever caused this is now gone. -- resolution: -> fixed stage: -> committed/rejected status: pending -> closed

[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread David Beazley
New submission from David Beazley: Is something like this supposed to work:
>>> import gzip
>>> import io
>>> f = io.TextIOWrapper(gzip.open("foo.gz"), encoding='ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: readable
In a nutshell--reading a .gz file as text
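For comparison, here is a self-contained sketch of the same pattern on a current Python 3, where wrapping a gzip file object in `io.TextIOWrapper` is expected to work. The temporary file and its contents are assumptions made so that the example can run on its own.

```python
import gzip
import io
import os
import tempfile

# Create a small .gz file to read back (hypothetical sample data).
path = os.path.join(tempfile.mkdtemp(), "foo.gz")
with gzip.open(path, "wb") as raw:
    raw.write(b"first line\nsecond line\n")

# Wrap the binary gzip stream to get decoded text.
with io.TextIOWrapper(gzip.open(path, "rb"), encoding="ascii") as text:
    lines = text.read().splitlines()

print(lines)
```

Closing the wrapper also closes the underlying gzip file, so one `with` block is enough.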

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky
Changes by Alexander Belopolsky: Added file: http://bugs.python.org/file20190/issue10542a.diff

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: I should stop using e-mail to reply to bug reports! The mangled example was
>>> 𐌀 = 5
  File "<stdin>", line 1
    𐌀 = 5
    ^
SyntaxError: invalid character in identifier

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: On Wed, Dec 29, 2010 at 11:36 AM, Georg Brandl wrote: .. > That bug already strikes me as quite exotic. > Would it look as exotic if presented like this? File "", line 1 𐌀 = 5 ^ SyntaxError: invalid character in identifier (works on a wide b

[issue10790] Header.append's charset logic is bogus, 'shift_jis' and "euc_jp' don't work as charsets

2010-12-29 Thread R. David Murray
R. David Murray added the comment: Updated patch that also fixes the docs. -- Added file: http://bugs.python.org/file20189/header_append.patch

[issue10790] Header.append's charset logic is bogus, 'shift_jis' and "euc_jp' don't work as charsets

2010-12-29 Thread R. David Murray
Changes by R. David Murray: -- nosy: +barry

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: The example in my previous message should have been: >>> '\U00010000' == '\uD800\uDC00' True
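The relationship between a supplementary code point and its surrogate pair, which underlies the comparison in this message, is plain UTF-16 arithmetic. The sketch below uses hypothetical helper names (`join_surrogates`, `split_code_point`) and works on integers, so it does not depend on whether the interpreter is a narrow or a wide build.

```python
def join_surrogates(high, low):
    """Combine a UTF-16 surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + (((high & 0x3FF) << 10) | (low & 0x3FF))


def split_code_point(code_point):
    """Split a supplementary code point into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000
    return 0xD800 | (offset >> 10), 0xDC00 | (offset & 0x3FF)


print(hex(join_surrogates(0xD800, 0xDC00)))
print([hex(unit) for unit in split_code_point(0x10300)])
```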

[issue10790] Header.append's charset logic is bogus, 'shift_jis' and "euc_jp' don't work as charsets

2010-12-29 Thread R. David Murray
New submission from R. David Murray: Working on issue 10686, I've discovered that the logic for charset conversion in email.header.Header.append is bogus. It happens to work for most charsets because for most charsets the input codec and the output codec are the same. For shift_jis and euc_
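The difference between input and output codecs mentioned here can be inspected through `email.charset.Charset`. This is a small sketch on a current Python 3; it only prints the two attributes and does not reproduce the `Header.append` logic that the report describes.

```python
from email.charset import Charset

# Compare the codec used to decode input with the one used for output.
for name in ("utf-8", "shift_jis", "euc-jp"):
    charset = Charset(name)
    print(name, charset.input_codec, charset.output_codec)
```

For charsets where the two values differ, any code that assumes a single codec for both directions is the kind of logic the report calls into question.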

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: On Sat, Nov 27, 2010 at 5:24 PM, Marc-Andre Lemburg wrote: .. > Perhaps we should allow ord() to work on surrogates in > UCS4 builds as well. That would reduce the number of > surprises. > This is an interesting idea, however, having surrogates in UCS4 b

[issue10716] Modernize pydoc to use CSS

2010-12-29 Thread Ron Adam
Ron Adam added the comment: It may be useful to change those to 'id=' and 'class=' if possible. It isn't clear to me how much of pydoc is still part of the public api in python 3.x. pydoc.__all__ is set only to ['help']. Entering help(pydoc) just gives the basic help and command line argum

[issue10348] multiprocessing: use SysV semaphores on FreeBSD

2010-12-29 Thread Jesse Noller
Jesse Noller added the comment: Adding, or moving, to SYSV semaphores is very low on the list of things to do. If someone were to provide a patch, I'm sure we could consider it.

[issue5725] process SysV-Semaphore support

2010-12-29 Thread Jesse Noller
Changes by Jesse Noller: -- nosy: +jnoller

[issue10789] Lock.acquire documentation is misleading

2010-12-29 Thread Jyrki Pulliainen
New submission from Jyrki Pulliainen: In the threading module, the Lock.acquire documentation is misleading. The signature suggests that the blocking can be given as a keyword argument but that would lead to a TypeError, as thread.lock.acquire does not accept keyword arguments. The signature in
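For reference, this is how the call looks on a current Python 3, where `Lock.acquire` does accept `blocking` as a keyword argument. The report concerns an older version in which, as it says, the same call raised a TypeError, so treat this as a sketch of today's behaviour rather than of the version discussed.

```python
import threading

lock = threading.Lock()

# Non-blocking acquire: returns True if the lock was taken.
first = lock.acquire(blocking=False)
# The lock is already held, so a second non-blocking attempt fails.
second = lock.acquire(blocking=False)

if first:
    lock.release()

print(first, second)
```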

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: On Wed, Dec 29, 2010 at 7:19 AM, Marc-Andre Lemburg wrote: .. > * The macros still need some more attention to enhance their performance. > Although I made your suggested change from '-' to '&', I seriously doubt that this would make any difference on mod

[issue6210] Exception Chaining missing method for suppressing context

2010-12-29 Thread Ethan Furman
Changes by Ethan Furman: -- versions: +Python 3.3 -Python 3.2

[issue6210] Exception Chaining missing method for suppressing context

2010-12-29 Thread Ethan Furman
Ethan Furman added the comment: > I said the *except* block, not the *try* block ;) Ah. So you did. Okay, if I'm understanding correctly, the scenario you are talking about involves the code in the except block calling some other function, and that other function is raising an exception...

[issue10716] Modernize pydoc to use CSS

2010-12-29 Thread Georg Brandl
Georg Brandl added the comment: Well, you could reuse these arguments to mean CSS classes, and have styles like ".red { color: red }" :) -- nosy: +georg.brandl

[issue10716] Modernize pydoc to use CSS

2010-12-29 Thread Ron Adam
Ron Adam added the comment: The HtmlDoc class has methods that take colors. Can this be changed or does it need to be deprecated first? def heading(self, title, fgcol, bgcol, extras=''): """Format a page heading.""" return '''    %s%s ''' % (bgcol, fgcol, title, fgc

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Georg Brandl
Georg Brandl added the comment: That bug already strikes me as quite exotic. You need to at least address Marc-Andre's remarks, and to give an overview of what else you'd like to change as well, and how this could affect semantics. Remember that the next release is already a release candidate

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: On Wed, Dec 29, 2010 at 10:00 AM, Georg Brandl wrote: .. > >> Let's wait for 3.3 with the change. > > Definitely. Does this also mean that the numerous surrogates related bugs should wait until 3.3 as well? (See issues #9200 and #10521.) This patch was

[issue10788] test_logging failure

2010-12-29 Thread Vinay Sajip
Vinay Sajip added the comment: These failures in build 363 (using r87563) would occur if some stdlib code added a handler to the root logger before the start of test_logging. I see that build 364 doesn't show this failure, and it's testing r87564. From what I can see, the only changes in r875

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Georg Brandl
Georg Brandl added the comment: > Let's wait for 3.3 with the change. Definitely. -- nosy: +georg.brandl versions: +Python 3.3 -Python 3.2

[issue7511] msvc9compiler.py: ValueError: [u'path']

2010-12-29 Thread Thorsten Behrens
Thorsten Behrens added the comment: Confirmed that this issue exists on Python 3.1 and 3.2b2. The exception thrown presents as: ValueError: ['path', 'include', 'lib']

[issue10753] request_uri method of wsgiref module does not support RFC1808 params.

2010-12-29 Thread R. David Murray
R. David Murray added the comment: In this case I think it is safe enough, since it only results in the ;,= not getting encoded. If an application were doing anything with the encoded chars, it would probably be decoding them, and now that step will simply become a noop. Of course, breakage
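The effect described here, leaving `;`, `,` and `=` unencoded, can be shown with `urllib.parse.quote` and its `safe` parameter. The path and the `safe` string below are assumptions chosen for illustration; the excerpt does not show the actual patch.

```python
from urllib.parse import quote

path = "/cgi-bin/app;type=a,b"

# Default quoting percent-encodes the parameter delimiters.
default_quoted = quote(path)
# Marking ';', '=' and ',' as safe leaves RFC 1808 params intact.
params_preserved = quote(path, safe="/;=,")

print(default_quoted)
print(params_preserved)
```

An application that previously decoded the percent-escapes would simply find nothing to decode in the second form.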

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Alexander Belopolsky wrote: > > Alexander Belopolsky added the comment: > > I am attaching a patch for commit review. I added an underscore prefix to > all new macros. This way I am not introducing new features and we will have > a full release cycle

[issue6210] Exception Chaining missing method for suppressing context

2010-12-29 Thread Antoine Pitrou
Antoine Pitrou added the comment: Le mercredi 29 décembre 2010 à 01:15 +, Ethan Furman a écrit : > Ethan Furman added the comment: > > > I'm talking about the exception raised from the except block. > > So was I -- why should this: > > try: > x = y / z > except ZeroDivisionError as e

[issue9742] Python 2.7: math module fails to build on Solaris 9

2010-12-29 Thread Matt Selsky
Changes by Matt Selsky: -- nosy: +Matt.Selsky

[issue5672] Implement a way to change the python process name

2010-12-29 Thread Floris Bruynooghe
Floris Bruynooghe added the comment: There are actually a few implementations on PyPI, just search for prctl. At least one of them is pretty decent IIRC but I can't remember which one I looked at in detail before. Anyway, they would certainly be a reasonable starting point for python inclusion

[issue10787] [random.gammavariate] Add the expression of the distribution in a comprehensive form for random.gammavariate

2010-12-29 Thread Mark Dickinson
Changes by Mark Dickinson: -- nosy: +mark.dickinson

[issue6210] Exception Chaining missing method for suppressing context

2010-12-29 Thread Nick Coghlan
Nick Coghlan added the comment: For "can't tell" in my previous message, read "we aren't going to impose the requirement to be able to tell if an exception is being raised directly in the current exception handler as a feature of conforming Python implementations". We probably *could* tell th

[issue6210] Exception Chaining missing method for suppressing context

2010-12-29 Thread Nick Coghlan
Nick Coghlan added the comment: No, the context must always be included unless explicitly suppressed. The interpreter can't reliably tell the difference between a raise statement in the current exception handler and one buried somewhere inside a nested function call. The whole point is to giv
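The explicit suppression discussed in this thread is available in later Python 3 releases as `raise ... from None` (PEP 409, Python 3.3). The sketch below contrasts it with an ordinary raise inside an exception handler; the function names are hypothetical.

```python
def lookup_chained(mapping, key):
    try:
        return mapping[key]
    except KeyError:
        # Implicit chaining: the KeyError is recorded as __context__.
        raise ValueError("unknown key")


def lookup_suppressed(mapping, key):
    try:
        return mapping[key]
    except KeyError:
        # Explicit suppression: the context is kept but not displayed.
        raise ValueError("unknown key") from None


try:
    lookup_chained({}, "x")
except ValueError as exc:
    chained = exc

try:
    lookup_suppressed({}, "x")
except ValueError as exc:
    suppressed = exc

print(type(chained.__context__).__name__, chained.__suppress_context__)
print(type(suppressed.__context__).__name__, suppressed.__suppress_context__)
```

In both cases the original exception stays reachable through `__context__`; only the default traceback display changes.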