[issue2636] Regexp 2.7 (modifications to current re 2.2.2)
Steven D'Aprano steve+pyt...@pearwood.info added the comment: I'm not sure if this belongs here, or on the Google code project page, so I'll add it in both places :) Feature request: please change the NEW flag to something else. In five or six years (give or take), the re module will be long forgotten, compatibility with it will not be needed, so-called new features will no longer be new, and the NEW flag will just be silly. If you care about future compatibility, some sort of version specification would be better, e.g. VERSION=0 (current re module), VERSION=1 (this regex module), VERSION=2 (next generation). You could then default to VERSION=0 for the first few releases, and potentially change to VERSION=1 some time in the future. Otherwise, I suggest swapping the sense of the flag: instead of re behaviour unless NEW flag is given, I'd say re behaviour only if OLD flag is given. (Old semantics will, of course, remain old even when the new semantics are no longer new.) -- nosy: +stevenjd ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue2636 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12731] python lib re uses obsolete sense of \w in full violation of UTS#18 RL1.2a
Ezio Melotti ezio.melo...@gmail.com added the comment:

> Or the re module should be *replaced* by the code from the regex module (but renamed to re, and with certain backwards compatibilities restored, probably).

This is what I meant.

> But I really hope the re module (really: the _sre extension module) can be fixed.

Starting to fix these issues from scratch doesn't make much sense IMHO. We could extract the fixes from regex and merge them into re, but then again it's probably easier to just replace the whole module.

> We should also make a habit in our docs of citing specific versions of the Unicode standard, and specific TR numbers and versions where they apply.

While this is a good thing, it's not always doable. Usually someone reports a bug related to something specified in some standard and only that part gets fixed. Sometimes everything else is also updated to follow the whole standard, but often this happens incrementally, so we can't say, e.g., "the re module supports Unicode x.y" unless we go through the whole standard and fix/implement everything.
[issue12731] python lib re uses obsolete sense of \w in full violation of UTS#18 RL1.2a
Ezio Melotti ezio.melo...@gmail.com added the comment:

> But I really hope the re module (really: the _sre extension module) can be fixed.

If you mean on 2.7/3.2, then I guess we could extract the fixes from regex, but we have to see if it's doable, and someone will have to do it. Also consider that the regex module is available for 2.7/3.2, so we could suggest that users switch to it if they have problems with the re bugs (even if that means having an additional dependency).

ISTM that the current plan is:
* replace re with regex (and rename it) on 3.3 and fix all these bugs;
* leave 2.7 and 3.2 with the old re and its bugs;
* let people use the external regex module on 2.7/3.2 if they need to.

If this is not ok, maybe it should be discussed on python-dev.
[issue12839] zlibmodule cannot handle Z_VERSION_ERROR zlib error
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset ba5000307b5d by Nadeem Vawda in branch '2.7':
Issue #12839: Fix crash in zlib module due to version mismatch.
http://hg.python.org/cpython/rev/ba5000307b5d

New changeset cc9e794bf94f by Nadeem Vawda in branch '3.2':
Issue #12839: Fix crash in zlib module due to version mismatch.
http://hg.python.org/cpython/rev/cc9e794bf94f

New changeset b384231df332 by Nadeem Vawda in branch 'default':
Merge: #12839: Fix crash in zlib module due to version mismatch.
http://hg.python.org/cpython/rev/b384231df332

-- nosy: +python-dev
[issue12759] (?P=) input for Tools/scripts/redemo.py raises unnhandled exception
Alexander fred...@mail.ru added the comment: I would like to make a patch.
[issue12720] Expose linux extended filesystem attributes
Benjamin Peterson benja...@python.org added the comment: And here is the next version, taking into account neologix's review.
-- Added file: http://bugs.python.org/file23056/xattrs.patch
[issue12287] ossaudiodev: stack corruption with FD = FD_SETSIZE
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset ff6adb867f40 by Charles-François Natali in branch '2.7':
Issue #12287: Fix a stack corruption in ossaudiodev module when the FD is
http://hg.python.org/cpython/rev/ff6adb867f40
[issue12287] ossaudiodev: stack corruption with FD = FD_SETSIZE
STINNER Victor victor.stin...@haypocalc.com added the comment:

The _socket module doesn't compile anymore on Windows:

    Build started: Project: _socket, Configuration: Debug|Win32
    Compiling...
    socketmodule.c
    29..\Modules\socketmodule.c(1649) : warning C4013: '_PyIsSelectable_fd' undefined; assuming extern returning int
    Linking...
    Creating library d:\cygwin\home\db3l\buildarea\2.7.bolen-windows\build\PCbuild\\_socket_d.lib and object d:\cygwin\home\db3l\buildarea\2.7.bolen-windows\build\PCbuild\\_socket_d.exp
    socketmodule.obj : error LNK2019: unresolved external symbol __PyIsSelectable_fd referenced in function _sock_accept
    d:\cygwin\home\db3l\buildarea\2.7.bolen-windows\build\PCbuild\\_socket_d.pyd : fatal error LNK1120: 1 unresolved externals
    Build log was saved at file://d:\cygwin\home\db3l\buildarea\2.7.bolen-windows\build\PCbuild\Win32-temp-Debug\_socket\BuildLog.htm
    _socket - 2 error(s), 1 warning(s)
[issue12287] ossaudiodev: stack corruption with FD = FD_SETSIZE
Charles-François Natali neolo...@free.fr added the comment: STINNER Victor victor.stin...@haypocalc.com added the comment: The _socket module doesn't compile anymore on Windows: Fixed (that's why I wanted a Windows expert to have a look at this patch :-). You might replace #if defined(_MSC_VER) with #if defined (MS_WINDOWS), but in another commit. I'd rather not modify code I don't understand. Plus, I have a really poor Windows karma... -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12287 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12837] Patch for issue #12810 removed a valid check on socket ancillary data
Charles-François Natali neolo...@free.fr added the comment:

> That has since been changed. I'm reading from POSIX.1-2008, which says:

I see.

> The warning against using values larger than 2**32 - 1 is still there, I presume because they would not fit in a 32-bit signed int.

I assume you mean 2**31 - 1.

>> I take it you mean CMSG_FIRSTHDR here
>
> Indeed. IIRC, I saw an implementation in old FreeBSD headers that did not check msg_controllen, and hence did not return NULL as RFC 3542 requires.

Alright, that's all I wanted to know.

> That said, the fact remains that the compiler warning is spurious if msg_controllen can be signed on some systems, and I still don't think decreasing the robustness of the code (particularly against any future modifications to that code) just for the sake of silencing a spurious warning is a good thing to do. People can read the comment above the offending line and see that the compiler has got it wrong.

Well, the compiler does not get it wrong. If socklen_t is defined as an unsigned int, it has no way of knowing that it might be defined as signed int on other platforms.
[issue12841] Incorrect tarfile.py extraction
Lars Gustäbel l...@gustaebel.de added the comment: The patch is fine. Thank you very much for it, Sebastien. I think we have to go without a unit test.
[issue12287] ossaudiodev: stack corruption with FD = FD_SETSIZE
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 852ca32eb18d by Charles-François Natali in branch '3.2':
Issue #12287: Fix a stack corruption in ossaudiodev module when the FD is
http://hg.python.org/cpython/rev/852ca32eb18d

New changeset ad1c09b6a5b9 by Charles-François Natali in branch 'default':
Issue #12287: Fix a stack corruption in ossaudiodev module when the FD is
http://hg.python.org/cpython/rev/ad1c09b6a5b9
[issue12837] Patch for issue #12810 removed a valid check on socket ancillary data
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 3ed2d087e70d by Charles-François Natali in branch 'default':
Issue #12837: POSIX.1-2008 allows socklen_t to be a signed integer: re-enable
http://hg.python.org/cpython/rev/3ed2d087e70d

-- nosy: +python-dev
[issue12837] Patch for issue #12810 removed a valid check on socket ancillary data
Charles-François Natali neolo...@free.fr added the comment:

Thanks for the patch.

For the record, here's Linus Torvalds' opinion on this whole socklen_t confusion:

> _Any_ sane library _must_ have socklen_t be the same size as int. Anything else breaks any BSD socket layer stuff. POSIX initially did make it a size_t, and I (and hopefully others, but obviously not too many) complained to them very loudly indeed. Making it a size_t is completely broken, exactly because size_t very seldom is the same size as int on 64-bit architectures, for example. And it has to be the same size as int because that's what the BSD socket interface is. Anyway, the POSIX people eventually got a clue, and created socklen_t. They shouldn't have touched it in the first place, but once they did they felt it had to have a named type for some unfathomable reason (probably somebody didn't like losing face over having done the original stupid thing, so they silently just renamed their blunder).

-- resolution: -> fixed stage: -> committed/rejected status: open -> closed
[issue12720] Expose linux extended filesystem attributes
Antoine Pitrou pit...@free.fr added the comment:

Is it normal that listxattr() succeeds but getxattr() fails with ENOTSUPP?

    >>> os.listxattr("/")
    []
    >>> os.getxattr("/", "foo")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OSError: [Errno 95] Operation not supported

This is on 2.6.38.8.
[issue8426] multiprocessing.Queue fails to get() very large objects
Changes by Charles-François Natali neolo...@free.fr:
-- components: +Documentation -Library (Lib) nosy: +docs@python priority: normal -> low
[issue12537] mailbox's _become_message is very fragile
Changes by Kasun Herath kasun...@gmail.com:
-- nosy: +kasun
[issue12720] Expose linux extended filesystem attributes
Benjamin Peterson benja...@python.org added the comment:

2011/8/28 Antoine Pitrou rep...@bugs.python.org:
> Antoine Pitrou pit...@free.fr added the comment:
>
> Is it normal that listxattr() succeeds but getxattr() fails with ENOTSUPP?
>
> >>> os.listxattr("/")
> []
> >>> os.getxattr("/", "foo")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> OSError: [Errno 95] Operation not supported

The reason you're getting ENOTSUP is that you have to use the proper prefix: user.*, for example.
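The prefix rule can be sketched roughly as follows, assuming Linux and a Python build that exposes os.getxattr (the function this patch adds). The helper names qualify and read_user_xattr are hypothetical and not part of the patch; "user." is the Linux namespace for unprivileged attributes.

```python
import os

USER_NAMESPACE = "user."  # Linux xattr namespace for unprivileged attributes


def qualify(name):
    """Return name with the 'user.' prefix, adding it only if it is missing."""
    if name.startswith(USER_NAMESPACE):
        return name
    return USER_NAMESPACE + name


def read_user_xattr(path, name):
    """Read a user-namespace attribute (hypothetical helper).

    Returns the raw bytes, or None when the attribute cannot be read.
    """
    if not hasattr(os, "getxattr"):  # non-Linux platform or older Python
        return None
    try:
        return os.getxattr(path, qualify(name))
    except OSError:
        # Attribute not set, filesystem without xattr support, and so on.
        return None
```

With this, read_user_xattr("/", "foo") asks for "user.foo" instead of the bare "foo" used in the session above. Whether the lookup then succeeds still depends on the filesystem and on the attribute having been set.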
[issue12720] Expose linux extended filesystem attributes
Benjamin Peterson benja...@python.org added the comment: After Antoine's review...
-- Added file: http://bugs.python.org/file23057/xattrs.patch
[issue11969] Can't launch multiproccessing.Process on methods
Changes by terry.h terry.her...@gmail.com:
-- nosy: +terry.h
[issue4028] Problem compiling the multiprocessing module on sunos5
Charles-François Natali neolo...@free.fr added the comment:

> Hello, there's some issues compiling the multiprocessing module on the SunOS I have here, where CMSG_LEN, CMSG_ALIGN, CMSG_SPACE and sem_timedwait are absent.

CMSG_LEN and friends should be defined by sys/socket.h (as required by POSIX). The SunOS 5.10 man page lists them:
http://download.oracle.com/docs/cd/E19253-01/816-5173/socket.h-3head/index.html
But the SunOS 5.9 version does not:
http://ewp.rpi.edu/hartford/webgen/sysdocs/C/solaris_9/SUNWaman/hman3head/socket.3head.html

> it looks like simply defining the first three macros like this works

It works, but it's probably not a good idea: if the headers don't define CMSG_LEN and friends, then FD passing will probably not work. It'd be better to not compile multiprocessing_(sendfd|recvfd) if CMSG_LEN is not defined (patch attached).

> sem_timedwait are absent.

Hmmm. Do you have the compilation log? Normally, if sem_timedwait isn't available, HAVE_SEM_TIMEDWAIT shouldn't be defined, and we should fall back to sem_trywait (by the way, calling sem_trywait multiple times until the timeout expires is not the same as calling sem_timedwait: this will fail in case of heavy contention). So this should build correctly. And this seems even stranger when I read Sebastian's message:

> so I had to commented out HAVE_SEM_TIMEDWAIT from setup.py, see:
>
>     elif platform.startswith('sunos5'):
>         macros = dict(
>             HAVE_SEM_OPEN=1,
>             HAVE_FD_TRANSFER=1
>             )
>             #HAVE_SEM_TIMEDWAIT=0,
>         libraries = ['rt']

Makes sense. If we define HAVE_SEM_TIMEDWAIT=0, then code guarded by #ifdef HAVE_SEM_TIMEDWAIT will be compiled, and the linker won't be able to resolve sem_timedwait. The preprocessor just checks that the symbol is defined, not that it has a non-zero value.

To sum up: could someone with a SunOS box test the attached patch, and post the compilation logs if it still fails?

-- keywords: +patch nosy: +neologix
Added file: http://bugs.python.org/file23058/multiprocessing_sendfd.diff
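The #ifdef point can be illustrated with a small Python model. This is only a sketch of the preprocessor rule, not code from setup.py: the functions to_define_flags, ifdef and if_value are hypothetical, and the two dicts mirror the macros mapping quoted above with and without the HAVE_SEM_TIMEDWAIT=0 entry.

```python
def to_define_flags(macros):
    """Turn a {name: value} mapping into -D flags, as a build script might."""
    return ["-D%s=%s" % (name, value) for name, value in sorted(macros.items())]


def ifdef(name, macros):
    """Model `#ifdef NAME`: true whenever NAME is defined, whatever its value."""
    return name in macros


def if_value(name, macros):
    """Model `#if NAME`: true only when NAME is defined with a non-zero value."""
    return bool(macros.get(name, 0))


# Defining the macro as 0 still defines it, so `#ifdef` code gets compiled.
with_zero = dict(HAVE_SEM_OPEN=1, HAVE_FD_TRANSFER=1, HAVE_SEM_TIMEDWAIT=0)
# Leaving the entry out is what actually disables the guarded code.
without = dict(HAVE_SEM_OPEN=1, HAVE_FD_TRANSFER=1)

print(to_define_flags(with_zero))
print(ifdef("HAVE_SEM_TIMEDWAIT", with_zero), if_value("HAVE_SEM_TIMEDWAIT", with_zero))
print(ifdef("HAVE_SEM_TIMEDWAIT", without))
```

In this model, only removing the entry (or guarding the C code with #if instead of #ifdef) keeps the sem_timedwait call out of the build.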
[issue12287] ossaudiodev: stack corruption with FD = FD_SETSIZE
Charles-François Natali neolo...@free.fr added the comment: Alright, committed to 2.7, 3.2 and default. Seems to work on all the buildbots, closing.
-- resolution: -> fixed stage: patch review -> committed/rejected status: open -> closed
[issue12731] python lib re uses obsolete sense of \w in full violation of UTS#18 RL1.2a
Guido van Rossum gu...@python.org added the comment:

[me]
>> But I really hope the re module (really: the _sre extension module) can be fixed.

[Ezio]
> Starting to fix these issues from scratch doesn't make much sense IMHO. We could extract the fixes from regex and merge them into re, but then again it's probably easier to just replace the whole module.

I have changed my mind at least half-way. I am open to having regex (with some changes, details TBD) replace re in 3.3. (I am not yet 100% convinced, but I'm not rejecting it as strongly as I was when I wrote that comment in this bug. See the ongoing python-dev discussion on this topic.)

>> We should also make a habit in our docs of citing specific versions of the Unicode standard, and specific TR numbers and versions where they apply.

> While this is a good thing, it's not always doable. Usually someone reports a bug related to something specified in some standard and only that part gets fixed. Sometimes everything else is also updated to follow the whole standard, but often this happens incrementally, so we can't say, e.g., "the re module supports Unicode x.y" unless we go through the whole standard and fix/implement everything.

Hm. I think that for Unicode it may actually be important enough to follow the whole standard that we should attempt to be consistent and not just chase bug reports. Now, we may consciously decide not to implement a certain recommendation of the standard. E.g. I'm not going to require that IronPython or Jython have string objects that support O(1) indexing of code points, even though (assuming PEP 393 gets accepted) CPython will have them. But these decisions should be made explicitly, and documented clearly.

Ideally, we need a Unicode czar -- a core developer whose job it is to keep track of Python's compliance with various parts and versions of the Unicode standard and who can nudge other developers towards fixing bugs or implementing features, or update the documentation in case things don't get added. (I like Tom's approach to Java 1.7, where he submitted proposed doc fixes explaining the deviations from the standard. Perhaps a bit passive-aggressive, but it was effective. :-)
[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation
Guido van Rossum gu...@python.org added the comment: Thanks Tom for such a clear explanation! I hope someone will implement this. (Matthew, does this affect regex? I am guessing it does, for case-insensitive matching?)
[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug
Guido van Rossum gu...@python.org added the comment:

> PEP-393 will take care of iterating by code points.

Only for CPython. IronPython/Jython will still need a separate solution.

> Where would you have other iterators go? The string module? Something else I have not thought of? Or something new?

Undecided. But I think we may want to create a new module which provides various APIs specifically for apps that need care when dealing with Unicode.
[issue12797] io.FileIO and io.open should support openat
Terry J. Reedy tjre...@udel.edu added the comment: I prefer a new parameter either at the end of the arglist or possibly keyword only. The idea for both variations is to let typical users ignore the option, which would be hard to do if it is part of the prime parameter. The idea for keyword only is that we might want to add other rarely used but useful options. They have no natural order, and having say, 8 positional params is pretty wretched. (I have worked with such APIs.) -- nosy: +terry.reedy ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12797 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12731] python lib re uses obsolete sense of \w in full violation of UTS#18 RL1.2a
Ezio Melotti ezio.melo...@gmail.com added the comment:

> Ideally, we need a Unicode czar -- a core developer whose job it is to keep track of Python's compliance with various parts and versions of the Unicode standard and who can nudge other developers towards fixing bugs or implementing features, or update the documentation in case things don't get added.

We should first do a full review of the latest Unicode standard and see what's missing. I think there might be parts of older Unicode versions (even Unicode 5) that are not yet implemented. Chapter 3 is a good place to start, but I'm not sure that's enough -- there are a few TRs that should be considered as well. If we manage to catch up with Unicode 6, then it shouldn't be too difficult to review the changes that every new version introduces and open an issue for each (or a single issue if the changes are limited).

FWIW I'm planning to look at the conformance of the UTF codecs and fix them (if necessary) whenever I have time.
[issue12805] Optimizations for bytes.join() et. al
Changes by Terry J. Reedy tjre...@udel.edu:
-- nosy: +terry.reedy
[issue12815] Coverage of smtpd.py
Changes by Terry J. Reedy tjre...@udel.edu:
-- components: +Library (Lib), Tests versions: +Python 3.3
[issue12814] Possible intermittent bug in test_array
Terry J. Reedy tjre...@udel.edu added the comment: Which Python version? 3.3?
-- components: +Library (Lib), Tests nosy: +terry.reedy
[issue12808] Coverage of codecs.py
Changes by Terry J. Reedy tjre...@udel.edu:
-- components: +Library (Lib), Tests versions: +Python 3.3 -Python 3.4
[issue12816] smtpd uses library outside of the standard libraries
Changes by Terry J. Reedy tjre...@udel.edu:
-- nosy: +terry.reedy
[issue12829] pyexpat segmentation fault caused by multiple calls to Parse()
Terry J. Reedy tjre...@udel.edu added the comment: A note for anyone else: David is actually using the xml.parsers.expat module, which uses the now undocumented pyexpat module, whose direct use is deprecated. David: Have you tested with 3.1 or 3.2? (I am about to try on Windows ;-).
-- nosy: +terry.reedy
[issue12829] pyexpat segmentation fault caused by multiple calls to Parse()
Terry J. Reedy tjre...@udel.edu added the comment:

Running with IDLE on Windows, I get no crash or uncaught exception, but I got these printed lines:

    An error occurred during XML parsing. Error ID: 9. Error message: junk after document element Line number: 1
    An error occurred during XML parsing. Error ID: 9. Error message: junk after document element Line number: 1
    An error occurred during XML parsing. Error ID: 9. Error message: junk after document element
    An error occurred during XML parsing. Error ID: 9. Error message: junk after document element Line number: 1
    An error occurred during XML parsing. Error ID: 9. Error message: junk after document element Line number: 1
    An error occurred during XML parsing. Error ID: 9. Error message: junk after document element

Is this the correct, expected output?
[issue12836] ctypes.cast() creates circular reference in original object
Terry J. Reedy tjre...@udel.edu added the comment: What action are you suggesting? Changing the ctypes code, its doc, or something else? If the doc, please suggest a specific change. Can you test on 3.x?
-- nosy: +terry.reedy title: cast() creates circular reference in original object -> ctypes.cast() creates circular reference in original object
[issue12843] file object read* methods in append mode overflows
Terry J. Reedy tjre...@udel.edu added the comment:

> I have confirmed that this only happens in windows.

This would literally mean that you tested on several other systems. Did you actually mean "I have only confirmed that this happens on Windows", i.e. that you only tested on Windows?

The 2.6 series is in security-fix-only mode.

-- nosy: +terry.reedy versions: -Python 2.6
[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: The regex module currently uses simple case-folding, although I'm working towards full case-folding, as listed in http://www.unicode.org/Public/UNIDATA/CaseFolding.txt.
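For readers unfamiliar with the distinction, here is a small sketch of what full case folding buys for caseless comparison. It assumes a later Python (3.3 or newer) where str.casefold() exists; that method is not part of the re or regex discussion above, and the function name caseless_equal is only illustrative.

```python
def caseless_equal(a, b):
    """Compare two strings using full case folding (str.casefold)."""
    return a.casefold() == b.casefold()


# Full folding may change the length of a string: U+00DF folds to "ss"
# and the U+FB01 ligature folds to "fi". A one-to-one mapping cannot do that.
print(len("stra\u00dfe"), len("stra\u00dfe".casefold()))
print(caseless_equal("STRASSE", "stra\u00dfe"))
print(caseless_equal("\ufb01sh", "FISH"))
```

A comparison built on lower() alone would treat "STRASSE" and "straße" as different, which is the kind of mismatch the full case-folding entries in CaseFolding.txt are meant to remove.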
[issue12849] urllib2 headers issue
New submission from Shubhojeet Ghosh shubhojeet.gh...@yahoo.com:

There seems to be an issue with urllib2: the headers set on the request do not match the data actually sent on the wire (as captured with Wireshark). Other headers, such as User-Agent and Cookie, work fine. Here is an example of a failure.

Python code:

    import urllib2
    url = "http://www.python.org"
    req = urllib2.Request(url)
    req.add_header('Connection', 'keep-alive')
    u = urllib2.urlopen(req)

Wireshark:

    GET / HTTP/1.1
    Accept-Encoding: identity
    Connection: close
    Host: www.python.org
    User-Agent: Python-urllib/2.6

-- components: IO messages: 143120 nosy: orsenthil, shubhojeet.ghosh priority: normal severity: normal status: open title: urllib2 headers issue type: behavior versions: Python 2.6
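A sketch that separates the two halves of the report without touching the network. It uses urllib.request, the Python 3 counterpart of urllib2, which is an assumption on my part since the report is against 2.6. As far as I know, the standard HTTP handler sets "Connection: close" itself when it sends the request because it does not reuse connections, which would explain the capture above; the snippet only shows that the Request object keeps the value it was given.

```python
import urllib.request

# Build the request exactly as in the report, but do not open it.
req = urllib.request.Request("http://www.python.org")
req.add_header("Connection", "keep-alive")

# The Request object stores the header as set by the caller.
stored = req.get_header("Connection")
print(stored)
print(req.header_items())
```

So the header is not lost by add_header(); any difference seen in Wireshark is introduced later, when the handler builds the outgoing request.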
[issue12841] Incorrect tarfile.py extraction
STINNER Victor victor.stin...@haypocalc.com added the comment: Should this bug be fixed in 3.3, or 2.7+3.2+3.3?
[issue12846] unicodedata.normalize turkish letter problem
Terry J. Reedy tjre...@udel.edu added the comment: You are doing two different things to the original string: normalizing and encoding to ascii with errors ignored. Each should be tested separately. On 3.2: import unicodedata s1 = üfürükçü ağaç ve ıslıkçı çeşme s2 = unicodedata.normalize('NFKD', s1) print(s2) print(s2.encode('ascii','ignore')) #prints üfürükçü ağaç ve ıslıkçı çeşme b'ufurukcu agac ve slkc cesme' The dotless i (== '\u0131') in s2 does not encode to ascii and is properly dropped when the error is ignored. I believe you are mistaken to think that unicodedata.normalize *should* turn turkish letter ı == \u131 into i. Unicodedata.decomposition(ı) returns an empty string, as it should (see below) because that character has no decomposition normalization in Unicode 6. So I am closing this issue as invalid. Here is the entry from http://www.unicode.org/Public/6.0.0/ucd/UnicodeData.txt 0131;LATIN SMALL LETTER DOTLESS I;Ll;0;L;N;;;0049;;0049 That is explained here http://www.unicode.org/reports/tr44/tr44-6.html#UnicodeData.txt The blank after 'L' (bidi class - left to right) is for decomposition type and mapping. There is none, so unicodedata.decomposition is correct. The last three entries are for uppercase, lowercase, and titlecase conversions. Those are different from normalizations. To reinforce this, http://www.unicode.org/Public/6.0.0/ucd/NormalizationTest.txt says explicitly @Part1 # Character by character test # All characters not explicitly occurring in c1 of Part 1 have identical NFC, D, KC, KD forms. 'c1' is column 1, starting from 1. In this list, 0130 is followed by 0132, omitting 0131, so the line above applies. After writing this, I discovered that Lib/test/test_normalization.py runs the complete test specified in NormalizationTest.txt for code points that have and do not have normalization forms. Side note Python 2.6 is in security-fix-only mode. 
--
nosy: +terry.reedy
resolution: -> invalid
status: open -> closed
versions: +Python 2.7, Python 3.2 -Python 2.6

Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12846
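The difference between decomposition and case mapping described in this comment can be checked with the standard unicodedata module. What follows is only a minimal sketch, assuming Python 3; the helper name normalization_report and the comparison with ü (U+00FC) are added for illustration and do not come from the report.

```python
import unicodedata


def normalization_report(ch):
    """Return (decomposition, NFKD form, ASCII bytes after NFKD) for one character.

    Hypothetical helper, named here for illustration only.
    """
    nfkd = unicodedata.normalize('NFKD', ch)
    return unicodedata.decomposition(ch), nfkd, nfkd.encode('ascii', 'ignore')


# Dotless i has no decomposition mapping, so NFKD leaves it alone and
# encoding with errors ignored drops it.
print(normalization_report('\u0131'))

# u with diaeresis decomposes to 'u' + COMBINING DIAERESIS, so the base
# letter survives the ASCII encoding.
print(normalization_report('\u00fc'))

# Case mapping is a separate property: the uppercase of dotless i is 'I'.
print('\u0131'.upper())
```

A caller that really wants ı folded to i has to do that explicitly, for example with str.translate, because no normalization form will do it.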
[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug
Terry J. Reedy tjre...@udel.edu added the comment:

> But I think we may want to create a new module which provides various APIs specifically for apps that need care when dealing with Unicode.

I have started thinking that way too -- perhaps unitools? It could contain the code point iterator for the benefit of other implementations. Actually, since PEP 393 still allows insertion of surrogate values, it might not be completely useless for CPython either.

--
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12729
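For readers wondering what such a code point iterator might look like, here is a rough sketch. The name iter_code_points is hypothetical (no such function exists in the standard library), and passing lone surrogates through unchanged is an assumption of this sketch rather than anything decided in the thread.

```python
def iter_code_points(text):
    """Yield integer code points from text, joining UTF-16 surrogate pairs.

    Hypothetical helper: a high surrogate followed by a low surrogate is
    combined into one supplementary code point; unpaired surrogates are
    yielded as they are.
    """
    pending_high = None
    for ch in text:
        cp = ord(ch)
        if pending_high is not None:
            if 0xDC00 <= cp <= 0xDFFF:
                # Valid pair: combine into a code point above U+FFFF.
                yield 0x10000 + ((pending_high - 0xD800) << 10) + (cp - 0xDC00)
                pending_high = None
                continue
            # The previous high surrogate was unpaired; emit it unchanged.
            yield pending_high
            pending_high = None
        if 0xD800 <= cp <= 0xDBFF:
            pending_high = cp
        else:
            yield cp
    if pending_high is not None:
        yield pending_high


# A string holding a surrogate pair, as a narrow build would store U+1F600.
print([hex(cp) for cp in iter_code_points('a\ud83d\ude00b')])
```

On a narrow build this gives one value per character instead of one per UTF-16 code unit; on a wide build or under PEP 393 it only matters for strings into which surrogates were inserted by hand.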
[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation
Tom Christiansen tchr...@perl.com added the comment:

Antoine Pitrou rep...@bugs.python.org wrote on Sat, 27 Aug 2011 20:04:56 -:

Neither am I. Even in old-style English with ae and oe, one wrote ÆGYPT and ÆSIR all caps but Ægypt and Æsir in titlecase, not *Aegypt or *Aesir. Similarly with ŒNOLOGY / Œnology / œnology, never *Oenology.

Trying to disprove you a bit:
http://ecx.images-amazon.com/images/I/51G6CH9XFFL._SL500_AA300_.jpg
http://ecx.images-amazon.com/images/I/51k7TmosPdL._SL500_AA300_.jpg
http://ecx.images-amazon.com/images/I/518UzMeLFCL._SL500_AA300_.jpg
but classical typographies seem to write either the uppercase Œ or the lowercase œ.

That's what I meant: one only ever sees œufs or ŒUFS, never OEUFS.

French doesn't fit into ISO 8859-1. That's one of the changes to ISO-8859-15 compared with ISO-8859-1 (and Unicode):

    iso-8859-1  A4 ⇔ U+00A4 < ¤ > \N{CURRENCY SIGN}
    iso-8859-15 A4 ⇒ U+20AC < € > \N{EURO SIGN}
    iso-8859-1  A6 ⇔ U+00A6 < ¦ > \N{BROKEN BAR}
    iso-8859-15 A6 ⇒ U+0160 < Š > \N{LATIN CAPITAL LETTER S WITH CARON}
    iso-8859-1  A8 ⇔ U+00A8 < ¨ > \N{DIAERESIS}
    iso-8859-15 A8 ⇒ U+0161 < š > \N{LATIN SMALL LETTER S WITH CARON}
    iso-8859-1  B4 ⇔ U+00B4 < ´ > \N{ACUTE ACCENT}
    iso-8859-15 B4 ⇒ U+017D < Ž > \N{LATIN CAPITAL LETTER Z WITH CARON}
    iso-8859-1  B8 ⇔ U+00B8 < ¸ > \N{CEDILLA}
    iso-8859-15 B8 ⇒ U+017E < ž > \N{LATIN SMALL LETTER Z WITH CARON}
    iso-8859-1  BC ⇔ U+00BC < ¼ > \N{VULGAR FRACTION ONE QUARTER}
    iso-8859-15 BC ⇒ U+0152 < Œ > \N{LATIN CAPITAL LIGATURE OE}
    iso-8859-1  BD ⇔ U+00BD < ½ > \N{VULGAR FRACTION ONE HALF}
    iso-8859-15 BD ⇒ U+0153 < œ > \N{LATIN SMALL LIGATURE OE}
    iso-8859-1  BE ⇔ U+00BE < ¾ > \N{VULGAR FRACTION THREE QUARTERS}
    iso-8859-15 BE ⇒ U+0178 < Ÿ > \N{LATIN CAPITAL LETTER Y WITH DIAERESIS}

That said, I wonder why Unicode even includes ligatures like ff. Sounds like mission creep to me (and horrible annoyances for people like us).

I'm pretty sure that typographic ligatures are there for roundtripping with legacy encodings. I believe that œ/Œ is the only code point with ligature in its name that you're supposed to still use, and that all others should be figured out by modern fonting software.

--tom

--
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12736
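The table of ISO-8859-1 versus ISO-8859-15 differences can be reproduced from Python's own codecs. This is a small sketch, assuming Python 3 and the standard 'iso-8859-1' and 'iso-8859-15' codec names; the function name latin1_vs_latin9 is made up for this example.

```python
import unicodedata


def latin1_vs_latin9():
    """Return (byte, latin-1 char, latin-9 char) for each byte decoded differently.

    Hypothetical helper: it decodes every single byte with both codecs and
    keeps the positions where the resulting characters differ.
    """
    rows = []
    for value in range(256):
        raw = bytes([value])
        old = raw.decode('iso-8859-1')
        new = raw.decode('iso-8859-15')
        if old != new:
            rows.append((value, old, new))
    return rows


for value, old, new in latin1_vs_latin9():
    print('%02X  U+%04X %-32s -> U+%04X %s' % (
        value, ord(old), unicodedata.name(old), ord(new), unicodedata.name(new)))
```

The loop finds the same eight byte positions listed in the message, including BC and BD for Œ and œ.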
[issue12839] zlibmodule cannot handle Z_VERSION_ERROR zlib error
Nadeem Vawda nadeem.va...@gmail.com added the comment:

Done. Once again, thanks for the report and the patch!

--
resolution: -> fixed
stage: -> committed/rejected
status: open -> closed

Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12839
[issue12843] file object read* methods in append mode overflows
Amaury Forgeot d'Arc amaur...@gmail.com added the comment:

You should call the .flush() method when switching from writes to reads. Nothing really overflows, but the fread() function may return uninitialized memory.

In versions 2.x, Python uses the fopen, fread and fwrite functions (from the C library) and is subject to their limitations. The exact behaviour is undefined, and it is quite possible that it only happens on Windows. See also the discussion in #7952.

This issue does not exist in versions 3.x, where the file functions have been rewritten.

--
nosy: +amaury.forgeotdarc
resolution: -> invalid
status: open -> closed

Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12843
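The advice about flushing can be written out as a short pattern. This is only a sketch, assuming Python 3 (where, as noted above, the problem does not occur anyway); the helper name append_then_read is hypothetical, and the explicit seek(0) is added here so that the read starts from the beginning of the file.

```python
import os
import tempfile


def append_then_read(path, data):
    """Append data to path, then read the whole file back through the same object.

    Hypothetical helper illustrating the write -> flush -> seek -> read order.
    """
    with open(path, 'a+') as f:
        f.write(data)
        f.flush()    # push buffered writes out before switching to reading
        f.seek(0)    # reposition; append mode leaves the position at the end
        return f.read()


demo_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(demo_path, 'w') as f:
    f.write('old\n')
print(append_then_read(demo_path, 'new\n'))
```

The same ordering is what the C library expects for a stream opened in update mode: a flush or a positioning call between a write and a following read.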
[issue12754] Add alternative random number generators
douglas bagnall doug...@paradise.net.nz added the comment:

Earlier this year I wrote Python wrappers for a number of generators: https://github.com/douglasbagnall/riffle

They are mostly cryptographic stream ciphers from the ESTREAM[1] project, but I was also interested in dSFMT[2], which is a SIMD optimised descendant of MT19937 which runs several times faster and directly produces doubles using cunning bit tricks.

[1] http://www.ecrypt.eu.org/stream/
[2] http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/#dSFMT

Wrapped in Python, the stream ciphers ran about as fast as MT19937 on my laptop, while dSFMT took about two thirds the time to run a random();random();random();... test. For a slightly more realistic test (sum(random() for x in range(N))), the performance levelled right off. As expected.

The stream cipher generators have some good properties. They generally generate random bytes using something analogous to hash('%s%s' % (seed, counter)), which means different seeds produce well separated streams, and to skip forward or back in the stream, you just adjust the counter. This would allow the reinstatement of Random()'s stream-skipping function, which some people (e.g. L'Ecuyer) think is important. (Incidentally, the MT people have come up with a jump-ahead algorithm for MT: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/JUMP/index.html)

Of the ciphers I tried, the chacha/salsa family and sosemanuk had the best combination of good testing and portable, reasonably fast, openly licensed C implementations. HC128 and snow2 also perform well. The chacha code is shorter than sosemanuk, so I would choose that. It is used as a primitive in the BLAKE SHA3 candidate, which is a vote of confidence and an attractor of testing for the algorithm.

--
nosy: +dbagnall

Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12754
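The "hash of seed and counter" structure, and the way it makes skipping trivial, can be shown with a toy generator. To be clear about the assumptions: this sketch uses SHA-256 from hashlib as a stand-in, not ChaCha, Salsa or any cipher from the riffle wrappers, and the class and method names (CounterModeRandom, jump_to_block) are invented for the example. It is not meant as a proposal for the random module.

```python
import hashlib
import struct


class CounterModeRandom:
    """Toy counter-mode generator: block i is SHA-256(seed bytes + counter i).

    Illustration only. SHA-256 is an assumed stand-in for a stream cipher.
    """

    DOUBLES_PER_BLOCK = 4  # a 32-byte digest holds four 64-bit words

    def __init__(self, seed):
        self._seed = str(seed).encode('utf-8')
        self._counter = 0
        self._buffer = []

    def _block(self, counter):
        # Each block depends only on the seed and the counter value.
        digest = hashlib.sha256(self._seed + struct.pack('>Q', counter)).digest()
        words = struct.unpack('>4Q', digest)
        # Keep the top 53 bits of each word to get a float in [0.0, 1.0).
        return [(word >> 11) * 2.0 ** -53 for word in words]

    def random(self):
        if not self._buffer:
            self._buffer = self._block(self._counter)
            self._counter += 1
        return self._buffer.pop(0)

    def jump_to_block(self, counter):
        """Skip forward or back by setting the block counter directly."""
        self._counter = counter
        self._buffer = []


gen = CounterModeRandom('example seed')
print([gen.random() for _ in range(3)])
```

Because no block depends on the previous one, jumping to block n costs the same as producing the first block, which is the property that makes a stream-skipping method cheap for this family of generators.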
[issue12754] Add alternative random number generators
douglas bagnall doug...@paradise.net.nz added the comment:

A bit more on the state size and period of the stream ciphers. Chacha and Salsa use 64 bytes (512 bits) of state (vs ~2.5kB for MT19937). The counter is 64 bits, and the seed can be 320 bits (in cipher-speak, the seed is split between a 256 bit key and a 64 bit IV). Each counter iteration produces 64 random bytes, or 8 doubles, so for any seed, you get a cycle of 2 ** 67 doubles, which would last in the order of 100 thousand years on current PCs.

Some of the other ciphers I looked at have smaller seeds and states, and some produce fewer bytes per iteration, but I don't think any of them will result in a cycle of smaller than 2 ** 64.

PS: Regarding the discussion of something like Random.getrandbytes(n): +1

--
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12754
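The figures in this comment can be checked with a few lines of arithmetic. Two things below are not from the message: the 128 bits of fixed constants, which is how the Salsa/ChaCha designs fill the rest of the 512-bit state, and the generation rate ASSUMED_DOUBLES_PER_SECOND, which is a made-up round number used only to get an order of magnitude.

```python
# State layout in bits: key + IV + counter come from the message; the
# remainder is taken up by fixed constants in the Salsa/ChaCha designs.
STATE_BITS = 64 * 8
KEY_BITS, IV_BITS, COUNTER_BITS = 256, 64, 64
SEED_BITS = KEY_BITS + IV_BITS
CONSTANT_BITS = STATE_BITS - (SEED_BITS + COUNTER_BITS)

# Each counter value yields one 64-byte block, i.e. eight 8-byte doubles.
BYTES_PER_BLOCK = 64
DOUBLES_PER_BLOCK = BYTES_PER_BLOCK // 8
cycle_doubles = 2 ** COUNTER_BITS * DOUBLES_PER_BLOCK

# Hypothetical throughput, assumed for illustration only.
ASSUMED_DOUBLES_PER_SECOND = 5e7
SECONDS_PER_YEAR = 365.25 * 24 * 3600
years = cycle_doubles / ASSUMED_DOUBLES_PER_SECOND / SECONDS_PER_YEAR

print('seed bits:', SEED_BITS, 'constant bits:', CONSTANT_BITS)
print('cycle length is 2 **', cycle_doubles.bit_length() - 1, 'doubles')
print('years to exhaust at the assumed rate: %.0f' % years)
```

So the 320-bit seed and the 64-bit counter are both part of the 512-bit state, and at the assumed rate the cycle lasts on the order of 100 thousand years, as stated.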
[issue12846] unicodedata.normalize turkish letter problem
Changes by Ezio Melotti ezio.melo...@gmail.com:

--
nosy: +ezio.melotti
stage: -> committed/rejected

Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12846
[issue12754] Add alternative random number generators
Raymond Hettinger raymond.hettin...@gmail.com added the comment:

Thanks Douglas. Can you say what the cryptographic guarantees are for Chacha and Salsa (seeing a stream of randoms doesn't allow you to deduce internal state, previous randoms, or future randoms)? Is it suitably strong for gaming (dealing poker hands, lottery numbers, etc)?

I'm not sure I follow the notes on state size. Is it 320 bits + 64 bits or is it 512 bits? Also, I'm not sure that the smaller state is an advantage that users care about (unless they are pickling many instances of the prngs).

It's okay for jumpahead() to reappear in generators that support it, but that method can't be a mandatory part of the Random API because it doesn't make sense for many PRNGs where a jumpahead function isn't known.

With respect to the SIMD optimizations and longlong to double operations, I'm curious to take a look at how it was done yet wonder if there is a provable, portable implementation and also wonder if it is worth it (the speed of generating a random() tends to be dwarfed by surrounding code that actually uses the result -- allocating the python object, etc).

--
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12754
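On the question of a portable integer-to-double conversion: one approach that relies only on ordinary integer and floating point arithmetic is to keep the top 53 bits of a 64-bit word and scale by 2 ** -53. This is a sketch of that general technique, not of what dSFMT or the riffle wrappers actually do; the function names are hypothetical.

```python
import struct

TWO_POW_53 = 9007199254740992.0  # 2 ** 53, the number of distinct outputs


def uint64_to_unit_float(x):
    """Map an unsigned 64-bit integer to a float in [0.0, 1.0).

    Hypothetical helper: the low 11 bits are discarded so that the remaining
    53 bits fit exactly in the significand of an IEEE 754 double.
    """
    if not 0 <= x < 2 ** 64:
        raise ValueError('expected an unsigned 64-bit integer')
    return (x >> 11) * (1.0 / TWO_POW_53)


def bytes_to_unit_float(block):
    """Interpret 8 bytes (little-endian) as one float in [0.0, 1.0)."""
    (x,) = struct.unpack('<Q', block)
    return uint64_to_unit_float(x)


print(uint64_to_unit_float(0))
print(uint64_to_unit_float(2 ** 63))
print(bytes_to_unit_float(b'\xff' * 8))
```

Every result is a multiple of 2 ** -53 and the largest possible value is 1 - 2 ** -53, so the output never reaches 1.0. Whether this is faster than bit tricks on the double's representation is a separate question that this sketch does not address.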