[issue12737] str.title() is overzealous by upcasing combining marks inappropriately
Martin v. Löwis mar...@v.loewis.de added the comment: Tom: it's intentional that .title() doesn't use traditional word break algorithms. In 2.x, "foo3bar".title() is "Foo3Bar", i.e. the 3 counts as a word end. So neither UTS#18 \w nor UAX#29 apply. So in UTS#18 terminology, .title() matches \alpha+ more closely, despite UTS#18 saying that this shouldn't be used for word-breaking. It's not clear to me how UTS#18 defines \alpha. On the one hand, they say that marks should be included; on the other hand, they refer to the Alphabetic derived category, which doesn't include marks, except for the few that have been included in Other_Alphabetic. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12737 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
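The word-end behaviour described in the comment above can be checked with a short sketch. It assumes Python 3; the helper name `titlecase_demo` is hypothetical and exists only to give the example something to call.

```python
def titlecase_demo(text):
    """Thin, hypothetical wrapper around str.title() for demonstration."""
    # str.title() starts a new "word" after any character that is not a
    # letter, so a digit or an apostrophe ends the current word.
    return text.title()


print(titlecase_demo("foo3bar"))         # the digit 3 acts as a word end
print(titlecase_demo("they're bill's"))  # apostrophes act as word ends too
```

This only illustrates the digit and apostrophe cases; it makes no claim about how combining marks are handled in any particular Python version.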
[issue10181] Problems with Py_buffer management in memoryobject.c (and elsewhere?)
Changes by Stefan Krah stefan-use...@bytereef.org: Added file: http://bugs.python.org/file23185/4492afe05a07.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10181 ___
[issue10181] Problems with Py_buffer management in memoryobject.c (and elsewhere?)
Stefan Krah stefan-use...@bytereef.org added the comment: Revision 4492afe05a07 allows memoryview to handle objects with an __index__() method. This is for compatibility with the struct module (see also #8300). -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10181 ___
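As a rough illustration of what accepting objects with an __index__() method means, the sketch below packs such an object with struct and stores it through a memoryview. It assumes a Python 3 version in which both modules accept __index__() objects for integer formats; the class name `Idx` is hypothetical.

```python
import struct


class Idx:
    """Hypothetical integer-like object that only defines __index__()."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value


# struct converts the object through __index__() for integer formats.
packed = struct.pack("B", Idx(7))

# A memoryview over a writable buffer (format 'B') accepts the same object.
buf = bytearray(2)
view = memoryview(buf)
view[0] = Idx(7)
```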
[issue12960] threading.Condition is not a class
Changes by STINNER Victor victor.stin...@haypocalc.com: -- resolution: -> wont fix status: open -> closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12960 ___
[issue12999] _XOPEN_SOURCE and _XOPEN_SOURCE_EXTENDED usage on Solaris
STINNER Victor victor.stin...@haypocalc.com added the comment: Martin dropped _XOPEN_SOURCE in issue #1759169 (commit 7c947768b435). -- FYI I changed configure(.in) to set _XOPEN_SOURCE to 700 on OpenBSD 5 to get recent C functions like fdopendir(): # X/Open 7, incorporating POSIX.1-2008 AC_DEFINE(_XOPEN_SOURCE, 700, Define to the level of X/Open that your system supports) -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12999 ___
[issue12996] multiprocessing.Connection endianness issue
STINNER Victor victor.stin...@haypocalc.com added the comment: "Since the rewrite in pure Python of multiprocessing.Connection (issue #11743), multiprocessing.Connection sends and receives the length of the data (used as header) in host byte order." I don't think so, the C code also uses the host endianness. This issue is a feature request. I don't know if anyone uses multiprocessing on different hosts (because it doesn't work currently). If you would like to support using multiprocessing on different hosts, it should be documented in the multiprocessing doc. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12996 ___
[issue12981] rewrite multiprocessing (senfd|recvfd) in Python
STINNER Victor victor.stin...@haypocalc.com added the comment: "It works fine on Linux, FreeBSD, OS X and Windows, but not on Solaris: see issue #12999." Oh, thanks for testing before committing :) It's hard to debug multiprocessing. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12981 ___
[issue13001] test_socket.testRecvmsgTrunc failure on FreeBSD 7.2 buildbot
STINNER Victor victor.stin...@haypocalc.com added the comment: @requires_freebsd_version should be factorized with @requires_linux_version. Can we work around the FreeBSD (< 8) bug in C/Python? Or should we remove the function on FreeBSD < 8? -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13001 ___
[issue12996] multiprocessing.Connection endianness issue
Charles-François Natali neolo...@free.fr added the comment: "Since the rewrite in pure Python of multiprocessing.Connection (issue #11743), multiprocessing.Connection sends and receives the length of the data (used as header) in host byte order." "I don't think so, the C code uses also the host endian. This issue is a feature request." No. http://hg.python.org/cpython/file/5deecc04b7a2/Modules/_multiprocessing/socket_connection.c In conn_send_string(): /* The header of the message is a 32 bit unsigned number (in network order) which specifies the length of the body. If the message is shorter than about 16kb then it is quicker to combine the header and the body of the message and send them at once. */ [...] *(UINT32*)message = htonl((UINT32)length); In conn_recv_string(): ulength = ntohl(ulength); "I don't know if anyone uses multiprocessing on different hosts (because it doesn't work currently). If you would like to support using multiprocessing on different hosts, it should be documented in multiprocessing doc." It does work, it's even documented ;-) http://docs.python.org/dev/library/multiprocessing.html#multiprocessing-managers "A manager object returned by Manager() controls a server process which holds Python objects and allows other processes to manipulate them using proxies. [...] Server process managers are more flexible than using shared memory objects because they can be made to support arbitrary object types. Also, a single manager can be shared by processes on different computers over a network. They are, however, slower than using shared memory."
Managers use multiprocessing.connection to serialize data and send them over a socket: http://hg.python.org/cpython/file/5deecc04b7a2/Lib/multiprocessing/managers.py # # Mapping from serializer name to Listener and Client types # listener_client = { 'pickle' : (connection.Listener, connection.Client), 'xmlrpclib' : (connection.XmlListener, connection.XmlClient) } Yeah, Python's awesome :-) -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12996 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
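The network-order length header described in the C comment quoted above can be sketched with the struct module. This is only an illustration of the framing idea under the assumption of a 32-bit unsigned header; the function names are hypothetical, and it is not the code used by multiprocessing.connection.

```python
import struct


def pack_length_header(length):
    """Encode a body length as a 32-bit unsigned integer in network order."""
    # "!" selects network (big-endian) byte order regardless of the host,
    # which is what htonl() does in the C code.
    return struct.pack("!I", length)


def unpack_length_header(header):
    """Decode a 4-byte network-order header back into an integer."""
    return struct.unpack("!I", header)[0]


def frame(body):
    """Prefix a message body with its network-order length header."""
    return pack_length_header(len(body)) + body
```

Using "=I" or "@I" instead of "!I" would write the header in host byte order, so two hosts with different endianness would disagree about the length.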
[issue12981] rewrite multiprocessing (senfd|recvfd) in Python
Charles-François Natali neolo...@free.fr added the comment: "It works fine on Linux, FreeBSD, OS X and Windows, but not on Solaris: see issue #12999." "Oh, thanks for testing before committing :) It's hard to debug multiprocessing." Yes. Especially when you stumble upon a kernel/libc bug 25% of the time... So, what should I do? Apply the test catching the multiprocessing.connection ImportError to test_multiprocessing (which is necessary even with the current C version)? And then apply the pure Python version, or wait until the OpenIndiana case gets fixed? -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12981 ___
[issue12958] test_socket failures on Mac OS X
Nick Coghlan ncogh...@gmail.com added the comment: OK, I've now looked into *why* the socket tests are throwing errors in tearDown, and it has to do with the way the threaded client/server tests in test_socket are set up. Specifically, ThreadableTest uses tearDown to reraise any exception raised in the client thread, and these are therefore outside the scope of the expectedFailure suppression in unittest. Now that I've tracked this down, it would be fairly straightforward to fix this specifically within test_socket.ThreadableTest by appropriately adjusting the definition of ThreadableTest.clientRun to discard exceptions encountered in tests flagged as expected failures. However, I'm wondering if that's the right thing to do. Perhaps it would make more sense to change unittest itself so that expectedFailure also suppresses tearDown errors. It doesn't seem all that unusual for a known failing test to also cause problems for the tearDown code. Added Michael to the nosy list to ask for his advice/opinion. In the meantime, I'll work on a patch that adjusts ThreadableTest directly. -- assignee: -> ncoghlan nosy: +michael.foord ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12958 ___
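The pattern described above, where tearDown re-raises an exception captured in a client thread, can be sketched in miniature. This is a hypothetical stand-in, not the real test_socket.ThreadableTest; the names `run_demo` and `ThreadedClientTest` are invented for the example, and the interaction with expectedFailure is deliberately left out because it depends on the unittest version.

```python
import threading
import unittest


def run_demo():
    """Run one test whose client thread fails and return the TestResult."""

    class ThreadedClientTest(unittest.TestCase):
        def setUp(self):
            self.client_error = None
            self.client = threading.Thread(target=self._client_run)
            self.client.start()

        def _client_run(self):
            # Capture the client-side exception instead of losing it
            # when the thread exits.
            try:
                raise RuntimeError("client side failed")
            except Exception as exc:
                self.client_error = exc

        def tearDown(self):
            self.client.join()
            # Re-raising here means the problem is reported from tearDown,
            # outside the body of the test method itself.
            if self.client_error is not None:
                raise self.client_error

        def test_server_side(self):
            pass  # the server half passes on its own

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ThreadedClientTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Because the exception surfaces in tearDown rather than in the test method, the run records an error even though test_server_side itself did nothing wrong.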
[issue1294232] Error in metaclass search order
Nick Coghlan ncogh...@gmail.com added the comment: Looking at Daniel's updated patch is still on my to-do list, but I won't object if anyone else wants to take this forward (it will be at least a few weeks before I get to it). -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1294232 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12938] html.escape docstring does not mention single quotes (')
Senthil Kumaran sent...@uthcode.com added the comment: This is fixed in all revisions. -- resolution: -> fixed stage: -> committed/rejected status: open -> closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12938 ___
[issue11320] Can't call Py_SetPath() on pointer returned by Py_GetPath()
Debao Zhang dbzhang...@gmail.com added the comment: Hello everyone, I have found the reason for the problem. From the manual http://docs.python.org/py3k/c-api/init.html#Py_SetPath , we can see that after we call Py_SetPath, both sys.prefix and sys.exec_prefix will be empty. However, sys.prefix is used in sysconfig.py to generate the Makefile's name, and the empty sys.prefix causes the wrong path: lib/python3.2/config-3.2m/Makefile. sysconfig.py is imported by site.py, and site.py is used in Py_InitializeEx. So ... -- nosy: +dbzhang800 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue11320 ___
[issue12972] Color prompt + readline
Terry J. Reedy tjre...@udel.edu added the comment: Since 2.7 was released after 3.1, I will assume any bugfix was applied there also until someone determines otherwise. Thanks for checking. -- resolution: -> out of date status: open -> closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12972 ___
[issue12450] Use the Grisu algorithms to convert floats to strings
Mark Dickinson dicki...@gmail.com added the comment: "Its biggest deficiency (compared to Gay's dtoa.c) is its specialization to IEEE doubles." We're only using the portion of Gay's code (with some significant modifications at this point) that applies to the IEEE 754 binary64 format, so I don't think this is a concern. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12450 ___
[issue13004] pprint: add option to truncate seqeunces
New submission from Terry J. Reedy tjre...@udel.edu: From python-ideas thread truncate sequences in pretty-print ? On Sat, Sep 17, 2011 at 9:59 PM, Steven Samuel Cole i use pprint quite a bit during development to give me quick insight into what is going on inside my application. however, when there's any sequences involved, the output becomes less useful the longer these sequences are - i often find myself scrolling around (or even searching) in the terminal window, trying to find the bit of output i am interested in. imo, it would be great if pprint had a parameter 'max_len' or so that reduces output of every sequence to a maximum and inserts something like '...' to indicate truncation, e.g. {'my key': ['my list item 01', 'my list item 02', 'my list item 03', 'my list item 04', 'my list item 05', 'my list item 06', 'my list item 07', 'my list item 08', '...', 'my list item 10',]} somewhat comparable to the '...' already printed when a structure is more deeply nested than you want to know right now. On 9/18/2011 10:59 AM, Guido van Rossum wrote: Agreed, this would be a useful feature. I've reimplemented something like pprint a few times and always had to implement this truncation feature. If you or someone can contribute a patch that would be much appreciated! -- components: Library (Lib) messages: 144247 nosy: terry.reedy priority: normal severity: normal stage: test needed status: open title: pprint: add option to truncate seqeunces type: feature request versions: Python 3.3 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13004 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
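One way to approximate the requested behaviour today is to truncate the data before handing it to pprint. The sketch below is only that: a pre-processing helper with a hypothetical name (`truncate_sequences`) and a hypothetical `max_len` argument, not a pprint option. Unlike the example in the proposal, it appends the '...' marker at the end of the shortened sequence rather than before the last item.

```python
from pprint import pformat


def truncate_sequences(obj, max_len):
    """Return a copy of obj with every list/tuple cut to max_len items.

    A literal '...' string is appended wherever items were dropped.
    Dict values are processed recursively; other objects pass through.
    """
    if isinstance(obj, dict):
        return {key: truncate_sequences(value, max_len)
                for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        kept = [truncate_sequences(item, max_len) for item in obj[:max_len]]
        if len(obj) > max_len:
            kept.append('...')
        return tuple(kept) if isinstance(obj, tuple) else kept
    return obj


data = {'my key': ['my list item %02d' % i for i in range(1, 11)]}
print(pformat(truncate_sequences(data, 3)))
```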
[issue13004] pprint: add option to truncate seqeunces
Changes by Ezio Melotti ezio.melo...@gmail.com: -- nosy: +ezio.melotti ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13004 ___
[issue12958] test_socket failures on Mac OS X
Michael Foord mich...@voidspace.org.uk added the comment: See issue 10548. There is some resistance to expectedFailure masking errors in setUp/tearDown as these aren't the place where you would normally expect the expected failure... -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12958 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13000] unhandled exception at install
Martin v. Löwis mar...@v.loewis.de added the comment: Can you please run msiexec /i python2.7.2.msi /l*v python.log and compress and attach the resulting python.log? I'm skeptical though that we will be able to do anything about this issue. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13000 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12976] add support for MirBSD platform
Benny Siegert bsieg...@gmail.com added the comment: I agree that the patch is quite small. I am regularly building new Python versions (using pkgsrc) so I can maintain the patch for future releases. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12976 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13005] operator module docs include repeat
New submission from Luciano Ramalho luci...@ramalho.org: The operator module documentation for versions 3.2 and 3.3 includes the repeat function in a table 9.3.1. Mapping Operators to Functions [1], but fails to mention that the repeat function is deprecated and mul should be used instead, as described in the 2.7 version of the operator module docs [2]. The main entry for the repeat function was removed in the 3.2 and 3.3 docs, only the mention in the table remains [1]. [1] http://docs.python.org/py3k/library/operator#mapping-operators-to-functions [2] http://docs.python.org/library/operator#operator.__repeat__ -- assignee: docs@python components: Documentation messages: 144251 nosy: docs@python, luciano priority: normal severity: normal status: open title: operator module docs include repeat versions: Python 3.2, Python 3.3 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13005 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
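For readers looking for the replacement mentioned in the report, a minimal sketch of using operator.mul for sequence repetition follows. It assumes Python 3 and uses only operator.mul; the variable names are illustrative.

```python
import operator

# operator.mul(a, b) is equivalent to a * b, which repeats a sequence
# when one operand is a sequence and the other an integer.
repeated_text = operator.mul("ab", 3)
repeated_list = operator.mul([0], 2)
```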
[issue13005] operator module docs include repeat
Changes by Ezio Melotti ezio.melo...@gmail.com: -- nosy: +ezio.melotti, sandro.tosi stage: -> needs patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13005 ___
[issue12981] rewrite multiprocessing (senfd|recvfd) in Python
David Watson bai...@users.sourceforge.net added the comment: I had a look at this patch, and the FD passing looked OK, except that calculating the buffer size with CMSG_SPACE() may allow more than one file descriptor to be received, with the extra one going unnoticed - it should use CMSG_LEN() instead (the existing C implementation has the same problem, I see). CMSG_SPACE() exists to allow calculating the space required to hold multiple control messages, so it essentially gives the offset for the next cmsghdr struct such that any alignment requirements are satisfied. 64-bit systems will probably want to ensure that all CMSG_DATA() payloads are aligned on 8-byte boundaries, and so have CMSG_SPACE(4) == CMSG_SPACE(8) == CMSG_LEN(8) (the Linux headers, for instance, align to sizeof(size_t)). So with a 32-bit int, a buffer size of CMSG_SPACE(sizeof(int)) would allow *two* file descriptors to be received. CMSG_LEN() omits the padding, thus allowing only one. I'm not familiar with how the FD-passing facility is used in multiprocessing, but this seems as if it could be an avenue for DoS attacks that exhaust the number of file descriptors allowed for the receiving process. -- nosy: +baikie ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12981 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
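The difference between CMSG_LEN() and CMSG_SPACE() described above can be inspected from Python. This sketch assumes a Unix platform and a Python version that provides socket.CMSG_LEN() and socket.CMSG_SPACE(); the function name `cmsg_sizes` is hypothetical, and the actual numbers depend on the platform's alignment rules.

```python
import array
import socket


def cmsg_sizes():
    """Return ancillary-buffer sizes relevant to passing one descriptor."""
    fd_size = array.array("i").itemsize       # size of a C int
    len_one = socket.CMSG_LEN(fd_size)        # header + one fd, no padding
    space_one = socket.CMSG_SPACE(fd_size)    # same, plus trailing padding
    len_two = socket.CMSG_LEN(2 * fd_size)    # header + two fds, no padding
    # If the padded size for one descriptor is at least the unpadded size
    # for two, a CMSG_SPACE()-sized buffer can hold a second descriptor.
    room_for_two = space_one >= len_two
    return len_one, space_one, len_two, room_for_two


print(cmsg_sizes())
```

Sizing the receive buffer with CMSG_LEN() instead leaves no padding in which a second descriptor could fit.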
[issue8623] Aliasing warnings in socketmodule.c
David Watson bai...@users.sourceforge.net added the comment: For reference, the warnings are partially explained here: http://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Optimize-Options.html#index-fstrict_002daliasing-825 http://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Warning-Options.html#index-Wstrict_002daliasing-337 I get these warnings with GCC (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5 [i386], plus an additional one from the new recvmsg() code. I haven't tried GCC 4.5 or later, but as the docs imply, the warnings will not appear in debugging builds. I take it GCC is referring to C99 section 6.5, paragraphs 6 and 7 here, but I'm not sure exactly how much these are intended to prohibit with regard to the (mis)use of unions, or how strictly GCC actually enforces them. The attached socket-aliasing-sas2sa.diff is enough to get rid of the warnings with GCC 4.4.4 - it adds a struct sockaddr member to the sock_addr_t union type, changes the SAS2SA() macro to take the address of this member instead of using a cast, and modifies socket_gethostbyaddr() and socket_gethostbyname_ex() to use SAS2SA() (sock_recvmsg_guts() already uses it). Changing SAS2SA() also gets rid of most of the additional warnings produced by the aggressive warning setting -Wstrict-aliasing=2. However, the gethostby* functions still point to the union object with a pointer variable not matching the type actually stored in it, which the GCC docs warn against. To be more conservative, socket-aliasing-union-3.2.diff applies on top to get rid of these pointers, and instead directly accesses the union for each use other than providing a pointer argument to a function. socket-aliasing-union-recvmsg-3.3.diff does the same for 3.3, and makes the complained-about line in sock_recvmsg_guts() access the union directly as well.
One other consideration here is that the different sockaddr_* struct types used are likely to come under the common initial sequence rule for unions (C99 6.5.2.3, paragraph 5, or section A8.3 of K&R 2nd ed.), which might make some more questionable uses valid. That said, technically POSIX appears to require only that the s*_family members of the various sockaddr struct types have the same offset and type, not that they form part of a common initial sequence (s*_family need not be the first structure member - the BSDs for instance place it second, although it can still be part of a common initial sequence). -- keywords: +patch nosy: +baikie versions: +Python 3.3 Added file: http://bugs.python.org/file23186/socket-aliasing-sas2sa.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue8623 ___
[issue8623] Aliasing warnings in socketmodule.c
Changes by David Watson bai...@users.sourceforge.net: Added file: http://bugs.python.org/file23187/socket-aliasing-union-3.2.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue8623 ___
[issue8623] Aliasing warnings in socketmodule.c
Changes by David Watson bai...@users.sourceforge.net: Added file: http://bugs.python.org/file23188/socket-aliasing-union-3.3.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue8623 ___
[issue13001] test_socket.testRecvmsgTrunc failure on FreeBSD 7.2 buildbot
Changes by David Watson bai...@users.sourceforge.net: -- nosy: +baikie ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13001 ___
[issue12976] add support for MirBSD platform
Martin v. Löwis mar...@v.loewis.de added the comment: Ok, closing this as won't fix, then. -- resolution: -> wont fix status: open -> closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12976 ___
[issue12981] rewrite multiprocessing (senfd|recvfd) in Python
Changes by Charles-François Natali neolo...@free.fr: Removed file: http://bugs.python.org/file23180/multiprocessing_fd-2.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12981 ___
[issue12981] rewrite multiprocessing (senfd|recvfd) in Python
Charles-François Natali neolo...@free.fr added the comment: "I had a look at this patch, and the FD passing looked OK, except that calculating the buffer size with CMSG_SPACE() may allow more than one file descriptor to be received, with the extra one going unnoticed - it should use CMSG_LEN() instead" Thanks for catching this. Here's an updated patch. "(the existing C implementation has the same problem, I see)." I just checked, and the C version uses CMSG_SPACE() as the buffer size, but passes CMSG_LEN() to cmsg->cmsg_len and msg.msg_controllen. Or am I missing something? -- Added file: http://bugs.python.org/file23189/multiprocessing_fd-3.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12981 ___
[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug
Tom Christiansen tchr...@perl.com added the comment: Terry J. Reedy rep...@bugs.python.org wrote on Thu, 08 Sep 2011 18:56:11 -: On 9/8/2011 4:32 AM, Ezio Melotti wrote: So to summarize a bit, there are different possible levels of strictness: 1) all the possible encodable values, including the ones > 10FFFF; 2) values in range 0..10FFFF; 3) values in range 0..10FFFF except surrogates (aka scalar values); 4) values in range 0..10FFFF except surrogates and noncharacters; and this is what is currently available in Python: 1) not available, probably it will never be; 2) available through the 'surrogatepass' error handler; 3) default behavior (i.e. with the 'strict' error handler); 4) currently not available. Now, assume that we don't care about option 1 and want to implement the missing option 4 (which I'm still not 100% sure about). The possible options are: * add a new codec (actually one for each UTF encoding); * add a new error handler that explicitly disallows noncharacters; * change the meaning of 'strict' to match option 4; If 'strict' meant option 4, then 'scalarpass' could mean option 3. 'surrogatepass' would then mean 'pass surrogates also, in addition to non-char scalars'. I'm pretty sure that anything that claims to be UTF-{8,16,32} needs to reject both surrogates *and* noncharacters. Here's something from the published Unicode Standard's p.24 about noncharacter code points: • Noncharacter code points are reserved for internal use, such as for sentinel values. They should never be interchanged. They do, however, have well-formed representations in Unicode encoding forms and survive conversions between encoding forms. This allows sentinel values to be preserved internally across Unicode encoding forms, even though they are not designed to be used in open interchange. And here from the Unicode Standard's chapter on Conformance, section 3.2, p. 59: C2 A process shall not interpret a noncharacter code point as an abstract character. 
• The noncharacter code points may be used internally, such as for sentinel values or delimiters, but should not be exchanged publicly. I'd have to check the fine print, but I am pretty sure that "shall not" is an imperative form. We have understood that to mean that a conforming process *must*not* do that. It's because of that wording that in Perl, using either of {en,de}code() with any of the UTF-{8,16,32} encodings, including the LE/BE versions as appropriate, it will not produce nor accept a noncharacter code point like FDD0 or FFFE. Do you think we may perhaps have misread that conformance clause? Using Perl's special, loose-fitting utf8 encoding, you can get it to do noncharacter code points and even surrogates, but you have to suppress certain things to make that happen quietly. You can only do this with utf8, not any of the UTF-16 or UTF-32 flavors. There we give them no choice, so you must be strict. I agree this is not fully orthogonal. Note that this is the normal thing that people do: binmode(STDOUT, :utf8); which is the *loose* version. The strict one is utf8-strict or UTF-8: open(my $fh, :encoding(UTF-8), $pathname) So it is a bit too easy to get the loose one. We felt we had to do this because we were already using the loose definition (and allowing up to chr(2**32) etc) when the Unicode Consortium made clear what sorts of things must not be accepted, or perhaps, before we made ourselves clear on this. This will have been back in 2003, when I wasn't paying very close attention. I think that just like Perl, Python has a legacy of the original loose definition. So some way to accommodate that legacy while still allowing for a conformant application should be devised. My concern with Python is that people tend to make their own manual calls to encode/decode a lot more often than they do in Perl. That means that if you only catch it on a stream encoding, you'll miss it, because they will use binary I/O and miss the check. 
--tom Below I show a bit of how this works in Perl. Currently the builtin utf8 encoding is controlled somewhat differently from how the Encode module's encode/decode functions are. Yes, this is not my idea of good. This shows that noncharacters and surrogates do not survive the encoding/decoding process for UTF-16: % perl -CS -MEncode -wle 'print decode(UTF-16, encode(UTF-16, chr(0xFDD0)))' | uniquote -v \N{REPLACEMENT CHARACTER} % perl -CS -MEncode -wle 'print decode(UTF-16, encode(UTF-16, chr(0xFFFE)))' | uniquote -v \N{REPLACEMENT CHARACTER} % perl -CS -MEncode -wle 'print decode(UTF-16, encode(UTF-16, chr(0xD800)))' | uniquote -v UTF-16 surrogate U+D800 in subroutine entry at /usr/local/lib/perl5/5.14.0/darwin-2level/Encode.pm line 158. If you pass a third argument to encode/decode, you can
[issue1294232] Error in metaclass search order
Changes by Meador Inge mead...@gmail.com: -- nosy: +meador.inge stage: test needed -> patch review ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1294232 ___
[issue12958] test_socket failures on Mac OS X
Nick Coghlan ncogh...@gmail.com added the comment: OK, I'll just deal with the problem directly in test_socket then. It looks like my latest attempt (suppressing unittest._ExpectedFailure in test_socket.ThreadableTest.clientRun) did the trick, so I'll push the updated tests some time this evening: http://www.python.org/dev/buildbot/all/builders/AMD64%20Snow%20Leopard%202%20custom/builds/44/steps/test/logs/stdio -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12958 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10548] Error in setUp not reported as expectedFailure (unittest)
Nick Coghlan ncogh...@gmail.com added the comment: As another data point, this question came up again in the context of issue #12958. The new test_socket.ThreadableTest uses tearDown() to pick up and reraise any exception that occurred in the client thread. This meant that my initial attempts at flagging some expected failures (due to Mac OS X limitations) didn't work - the client half of the failure was thrown in tearDown() and reported as an error. While I've determined how to avoid that problem in the test_socket case, the general question of whether or not we consider it legitimate to put common assertions in setUp() and tearDown(), or expect that test code explicitly cope with tearDown() failures that occur due to expected test failures still needs to be addressed. To my mind, bugs in my test infrastructure are going to get flushed out by tests that I'm neither skipping nor marking as expected failures. If I have a test that is known to fail in a way that invalidates the standard tearDown procedure for the test infrastructure, having to special case that situation in the tearDown code seems to go against the spirit of offering the expectedFailure decorator in the first place. I don't think the same reasoning holds for setUp though - there's no way for a failing test to reach back and force setUp to fail, so any errors raised there are always going to be infrastructure errors. -- nosy: +ncoghlan ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10548 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10548] Error in setUp not reported as expectedFailure (unittest)
Michael Foord mich...@voidspace.org.uk added the comment: I think Twisted uses the tearDown to fail tests as well. As we have two use cases perhaps we should allow expectedFailure to work with failures in tearDown? (And if we do that it should cover setUp as well for symmetry or it becomes a morass of special cases.) -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10548 ___
[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug
Terry J. Reedy tjre...@udel.edu added the comment: My long-ago memory is that 'should not' is slightly looser in W3C parlance than 'must not'. However, it is a moot point if we decide to follow the 'should' in 3.3 for the default 'strict' mode, which both Ezio and I think we 'should' ;-). Our 'errors' parameter makes it easy to request something else, but the request has to be explicit. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12729 ___
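As a rough illustration of the last point, here is how the codec 'errors' parameter lets a caller explicitly opt out of the default 'strict' behaviour. The comment does not say which input is at stake, so the lone surrogate used here is an assumption picked for illustration, and `encode_variants` is a hypothetical helper name. The handler names are the standard ones in Python 3.

```python
def encode_variants(text):
    """Encode text to UTF-8 under three error handlers.

    Returns a dict mapping handler name to the encoded bytes, or to the
    UnicodeEncodeError instance if that handler rejected the input.
    """
    outcomes = {}
    for handler in ("strict", "surrogatepass", "replace"):
        try:
            outcomes[handler] = text.encode("utf-8", errors=handler)
        except UnicodeEncodeError as exc:
            outcomes[handler] = exc
    return outcomes


if __name__ == "__main__":
    # A lone surrogate is rejected under the default 'strict' handler;
    # the other two handlers must be requested by name.
    for name, value in encode_variants("\ud800").items():
        print(name, repr(value))
```

Nothing changes unless the caller names a handler, which matches the "has to be explicit" remark.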
[issue13006] bug in core python variable binding
New submission from Stephen Vavasis vava...@uwaterloo.ca: There seems to be a serious bug in how Python 2.7.2 binds variables to values. In the attached function buildfunclist, there is a variable called 'funclist' that is initialized to [] and then modified only with 'append' calls. This means that once append has been called 46 times, one expects funclist[45] to be defined and not to change. And yet funclist[45] changes several times as more data items are appended. The same bug is present in 3.2.2. My operating system is Windows 7 64-bit on a Lenovo Thinkpad T410. I'm guessing that there is a problem with Python's lazy copying: it is a bit too lazy and fails to make copies when lists are changed. To exhibit this bug, proceed as follows:
    import pickle
    h = open('combined_oplists_pickle','r')
    combined_oplists = pickle.Unpickler(h).load()
    import pybugreport
    funclist,funcdist = pybugreport.buildfunclist(combined_oplists)
and then you will see funclist[45] printed out on two successive iterations. It has changed as a result of an append operation, which should not happen. (Its 6th entry is longer.)
Here is the output:
funclist[45] = [0, 22973, '$FUNC', 'splitBoxInterior', [['InArg', [[['', 'ActiveBoxVectorI', '::', 'iterator', ''], ['thisboxdata_p', '']], [['', 'FaceIndex', ''], ['faceind', '', ['InOutArg', [[['', 'ActiveBoxVectorI', ''], ['interiorOrbitNextLev', '', ['RefGlobal', [[['', 'MIndex', ''], ['guiActiveBoxCount', '', ['Workspace', [[['', 'QMGVector', '', '', 'BoxCreationData', '', ' ', ''], ['boxCreationVec', ''], [], [[0, 23017], [0, 23048], [0, 23068], [0, 23069]], [[0, 23001]]]
funclist[45] = [0, 22973, '$FUNC', 'splitBoxInterior', [['InArg', [[['', 'ActiveBoxVectorI', '::', 'iterator', ''], ['thisboxdata_p', '']], [['', 'FaceIndex', ''], ['faceind', '', ['InOutArg', [[['', 'ActiveBoxVectorI', ''], ['interiorOrbitNextLev', '', ['RefGlobal', [[['', 'MIndex', ''], ['guiActiveBoxCount', '', ['Workspace', [[['', 'QMGVector', '', '', 'BoxCreationData', '', ' ', ''], ['boxCreationVec', ''], [[0, 23115], [0, 23116], [0, 23117], [0, 23118], [0, 23119], [0, 23120], [0, 23121], [0, 23122], [0, 23123], [0, 23124], [0, 23125], [0, 23126], [0, 23127], [0, 23128], [0, 23129], [0, 23130], [0, 23131], [0, 23132], [0, 23133], [0, 23134], [0, 23135], [0, 23136], [0, 23139], [0, 23140], [0, 23141], [0, 23142], [0, 23143], [0, 23144], [0, 23145], [0, 23146]], [[0, 23017], [0, 23048], [0, 23068], [0, 23069], [0, 23137], [0, 23138], [0, 23147], [0, 23148], [0, 23149], [0, 23161]], [[0, 23001]]]
-- components: Interpreter Core files: pybugreport.zip messages: 144261 nosy: vavasis priority: normal severity: normal status: open title: bug in core python variable binding type: behavior versions: Python 2.7 Added file: http://bugs.python.org/file23190/pybugreport.zip ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13006 ___
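For readers trying to reproduce a symptom like this, one ordinary explanation worth ruling out is aliasing. `list.append` stores a reference to the object it is given, not a copy, so mutating that object later through another name also changes what the list shows. The sketch below is not based on the attached pybugreport.py, which is not shown in this message. The record contents and the `build` helper are hypothetical, and whether aliasing is what happens in buildfunclist is an assumption.

```python
import copy


def build(aliased):
    """Append one record to a list, then mutate the record afterwards.

    Returns the repr of funclist[0] before and after the later mutation.
    """
    funclist = []
    # Hypothetical record shaped loosely like the entries printed above.
    entry = [0, 22973, '$FUNC', 'splitBoxInterior', [], []]

    # append() stores a reference; deepcopy() stores an independent copy.
    funclist.append(entry if aliased else copy.deepcopy(entry))
    before = repr(funclist[0])

    # Later code keeps filling in the nested list through the old name.
    entry[5].append([0, 23017])
    after = repr(funclist[0])
    return before, after


if __name__ == "__main__":
    for aliased in (True, False):
        before, after = build(aliased)
        print("aliased" if aliased else "copied", "changed:", before != after)
```

If the real function appends a list object that it keeps extending afterwards, storing a copy at append time (as in the non-aliased branch) would keep earlier entries stable.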