[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2011-08-28 Thread Steven D'Aprano

Steven D'Aprano steve+pyt...@pearwood.info added the comment:

I'm not sure if this belongs here, or on the Google code project page, so I'll 
add it in both places :)

Feature request: please change the NEW flag to something else. In five or six 
years (give or take), the re module will be long forgotten, compatibility with 
it will not be needed, so-called new features will no longer be new, and the 
NEW flag will just be silly.

If you care about future compatibility, some sort of version specification 
would be better, e.g. VERSION=0 (current re module), VERSION=1 (this regex 
module), VERSION=2 (next generation). You could then default to VERSION=0 for 
the first few releases, and potentially change to VERSION=1 some time in the 
future.

Otherwise, I suggest swapping the sense of the flag: instead of re behaviour 
unless NEW flag is given, I'd say re behaviour only if OLD flag is given. 
(Old semantics will, of course, remain old even when the new semantics are no 
longer new.)
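
A minimal sketch of the version-flag idea, for illustration only. The names
VERSION0, VERSION1, DEFAULT_VERSION and compile_pattern() are hypothetical and
are not part of the re module; only the VERSION0 path is backed by real code:

```python
import re

# Hypothetical version selectors; not part of the standard re module.
VERSION0 = 0  # current re semantics
VERSION1 = 1  # new regex-module semantics

# The library-wide default could be switched in a later release
# without renaming any flag.
DEFAULT_VERSION = VERSION0


def compile_pattern(pattern, flags=0, version=None):
    """Compile pattern under the requested semantics version."""
    if version is None:
        version = DEFAULT_VERSION
    if version == VERSION0:
        return re.compile(pattern, flags)
    if version == VERSION1:
        # Placeholder: a real implementation would dispatch to the new engine.
        raise NotImplementedError("VERSION1 semantics are not sketched here")
    raise ValueError("unknown version: %r" % (version,))
```

Unlike a NEW or OLD flag, a numbered version stays meaningful when a further
generation of semantics is added.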

--
nosy: +stevenjd

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue2636
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue12731] python lib re uses obsolete sense of \w in full violation of UTS#18 RL1.2a

2011-08-28 Thread Ezio Melotti

Ezio Melotti ezio.melo...@gmail.com added the comment:

> Or the re module should be *replaced* by the code from the regex module
> (but renamed to re, and with certain backwards compatibilities
> restored, probably).

This is what I meant.

> But I really hope the re module (really: the _sre extension module)
> can be fixed.

Starting to fix these issues from scratch doesn't make much sense IMHO.  We could 
extract the fixes from regex and merge them in re, but then again it's 
probably easier to just replace the whole module.

> We should also make a habit in our docs of citing specific versions
> of the Unicode standard, and specific TR numbers and versions where
> they apply.

While this is a good thing it's not always doable.  Usually someone reports a 
bug related to something specified in some standard and only that part gets 
fixed.  Sometimes everything else is also updated to follow the whole standard, 
but often this happens incrementally, so we can't say, e.g., "the re module 
supports Unicode x.y" unless we go through the whole standard and 
fix/implement everything.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12731
___



[issue12731] python lib re uses obsolete sense of \w in full violation of UTS#18 RL1.2a

2011-08-28 Thread Ezio Melotti

Ezio Melotti ezio.melo...@gmail.com added the comment:

> But I really hope the re module (really: the _sre extension module)
> can be fixed.

If you mean on 2.7/3.2, then I guess we could extract the fixes from regex, but 
we have to see if it's doable and someone will have to do it.

Also consider that the regex module is available for 2.7/3.2, so we could 
suggest that users switch to it if they have problems with the re bugs (even if 
that means having an additional dependency).

ISTM that the current plan is:
  * replace re with regex (and rename it) on 3.3 and fix all these bugs;
  * leave 2.7 and 3.2 with the old re and its bugs;
  * let people use the external regex module on 2.7/3.2 if they need to.

If this is not ok, maybe it should be discussed on python-dev.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12731
___



[issue12839] zlibmodule cannot handle Z_VERSION_ERROR zlib error

2011-08-28 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset ba5000307b5d by Nadeem Vawda in branch '2.7':
Issue #12839: Fix crash in zlib module due to version mismatch.
http://hg.python.org/cpython/rev/ba5000307b5d

New changeset cc9e794bf94f by Nadeem Vawda in branch '3.2':
Issue #12839: Fix crash in zlib module due to version mismatch.
http://hg.python.org/cpython/rev/cc9e794bf94f

New changeset b384231df332 by Nadeem Vawda in branch 'default':
Merge: #12839: Fix crash in zlib module due to version mismatch.
http://hg.python.org/cpython/rev/b384231df332

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12839
___



[issue12759] (?P=) input for Tools/scripts/redemo.py raises unhandled exception

2011-08-28 Thread Alexander

Alexander fred...@mail.ru added the comment:

I would like to make a patch.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12759
___



[issue12720] Expose linux extended filesystem attributes

2011-08-28 Thread Benjamin Peterson

Benjamin Peterson benja...@python.org added the comment:

And here is the next version, taking into account neologix's review.

--
Added file: http://bugs.python.org/file23056/xattrs.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12720
___



[issue12287] ossaudiodev: stack corruption with FD = FD_SETSIZE

2011-08-28 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset ff6adb867f40 by Charles-François Natali in branch '2.7':
Issue #12287: Fix a stack corruption in ossaudiodev module when the FD is
http://hg.python.org/cpython/rev/ff6adb867f40

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12287
___



[issue12287] ossaudiodev: stack corruption with FD = FD_SETSIZE

2011-08-28 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

The _socket module doesn't compile anymore on Windows:


Build started: Project: _socket, Configuration: Debug|Win32
Compiling...
socketmodule.c
29>..\Modules\socketmodule.c(1649) : warning C4013: '_PyIsSelectable_fd' undefined; assuming extern returning int
Linking...
   Creating library d:\cygwin\home\db3l\buildarea\2.7.bolen-windows\build\PCbuild\\_socket_d.lib and object d:\cygwin\home\db3l\buildarea\2.7.bolen-windows\build\PCbuild\\_socket_d.exp
socketmodule.obj : error LNK2019: unresolved external symbol __PyIsSelectable_fd referenced in function _sock_accept
d:\cygwin\home\db3l\buildarea\2.7.bolen-windows\build\PCbuild\\_socket_d.pyd : fatal error LNK1120: 1 unresolved externals
Build log was saved at file://d:\cygwin\home\db3l\buildarea\2.7.bolen-windows\build\PCbuild\Win32-temp-Debug\_socket\BuildLog.htm
_socket - 2 error(s), 1 warning(s)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12287
___



[issue12287] ossaudiodev: stack corruption with FD = FD_SETSIZE

2011-08-28 Thread Charles-François Natali

Charles-François Natali neolo...@free.fr added the comment:

> STINNER Victor victor.stin...@haypocalc.com added the comment:
>
> The _socket module doesn't compile anymore on Windows:


Fixed (that's why I wanted a Windows expert to have a look at this patch :-).

> You might replace #if defined(_MSC_VER) with
> #if defined(MS_WINDOWS), but in another commit.

I'd rather not modify code I don't understand. Plus, I have a really
poor Windows karma...

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12287
___



[issue12837] Patch for issue #12810 removed a valid check on socket ancillary data

2011-08-28 Thread Charles-François Natali

Charles-François Natali neolo...@free.fr added the comment:

> That has since been changed.  I'm reading from POSIX.1-2008,
> which says:

I see.

> The warning against using values larger than 2**32 - 1 is still
> there, I presume because they would not fit in a 32-bit signed
> int.

I assume you mean 2**31 - 1.
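
The difference between the two limits can be checked directly. A small sketch,
using ctypes.c_int32 as a stand-in for a 32-bit signed int (the constant and
variable names are illustrative only; ctypes integer types do no overflow
checking, so out-of-range values simply wrap):

```python
import ctypes

INT32_MAX = 2**31 - 1   # largest value a 32-bit signed int can hold
UINT32_MAX = 2**32 - 1  # largest value a 32-bit unsigned int can hold

# The signed maximum survives a round trip through a 32-bit signed int.
fits = ctypes.c_int32(INT32_MAX).value

# One past the signed maximum wraps to the most negative value.
wrapped = ctypes.c_int32(INT32_MAX + 1).value

# The unsigned maximum has all bits set, which reads back as -1 when signed.
all_bits = ctypes.c_int32(UINT32_MAX).value
```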

> I take it you mean CMSG_FIRSTHDR here

Indeed.

> IIRC, I saw an implementation in old FreeBSD headers that did not
> check msg_controllen, and hence did not return NULL as RFC 3542
> requires.

Alright, that's all I wanted to know.

> That said, the fact remains that the compiler warning is spurious
> if msg_controllen can be signed on some systems, and I still
> don't think decreasing the robustness of the code (particularly
> against any future modifications to that code) just for the sake
> of silencing a spurious warning is a good thing to do.  People
> can read the comment above the offending line and see that the
> compiler has got it wrong.

Well, the compiler does not get it wrong. If socklen_t is defined as
an unsigned int, it has no way of knowing that it might be defined as
signed int on other platforms.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12837
___



[issue12841] Incorrect tarfile.py extraction

2011-08-28 Thread Lars Gustäbel

Lars Gustäbel l...@gustaebel.de added the comment:

The patch is fine. Thank you very much for it, Sebastien. I think we have to go 
without a unit test.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12841
___



[issue12287] ossaudiodev: stack corruption with FD = FD_SETSIZE

2011-08-28 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 852ca32eb18d by Charles-François Natali in branch '3.2':
Issue #12287: Fix a stack corruption in ossaudiodev module when the FD is
http://hg.python.org/cpython/rev/852ca32eb18d

New changeset ad1c09b6a5b9 by Charles-François Natali in branch 'default':
Issue #12287: Fix a stack corruption in ossaudiodev module when the FD is
http://hg.python.org/cpython/rev/ad1c09b6a5b9

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12287
___



[issue12837] Patch for issue #12810 removed a valid check on socket ancillary data

2011-08-28 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 3ed2d087e70d by Charles-François Natali in branch 'default':
Issue #12837: POSIX.1-2008 allows socklen_t to be a signed integer: re-enable
http://hg.python.org/cpython/rev/3ed2d087e70d

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12837
___



[issue12837] Patch for issue #12810 removed a valid check on socket ancillary data

2011-08-28 Thread Charles-François Natali

Charles-François Natali neolo...@free.fr added the comment:

Thanks for the patch.

For the record, here's Linus Torvalds' opinion on this whole socklen_t 
confusion:

"_Any_ sane library _must_ have socklen_t be the same size as int.  Anything
else breaks any BSD socket layer stuff.  POSIX initially did make it a
size_t, and I (and hopefully others, but obviously not too many) complained
to them very loudly indeed.  Making it a size_t is completely broken, exactly
because size_t very seldom is the same size as int on 64-bit architectures,
for example.  And it has to be the same size as int because that's what the
BSD socket interface is.  Anyway, the POSIX people eventually got a clue, and
created socklen_t.  They shouldn't have touched it in the first place, but
once they did they felt it had to have a named type for some unfathomable
reason (probably somebody didn't like losing face over having done the
original stupid thing, so they silently just renamed their blunder)."


--
resolution:  -> fixed
stage:  -> committed/rejected
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12837
___



[issue12720] Expose linux extended filesystem attributes

2011-08-28 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Is it normal that listxattr() succeeds but getxattr() fails with ENOTSUP?

>>> os.listxattr("/")
[]
>>> os.getxattr("/", "foo")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 95] Operation not supported

This is on 2.6.38.8.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12720
___



[issue8426] multiprocessing.Queue fails to get() very large objects

2011-08-28 Thread Charles-François Natali

Changes by Charles-François Natali neolo...@free.fr:


--
components: +Documentation -Library (Lib)
nosy: +docs@python
priority: normal -> low

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8426
___



[issue12537] mailbox's _become_message is very fragile

2011-08-28 Thread Kasun Herath

Changes by Kasun Herath kasun...@gmail.com:


--
nosy: +kasun

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12537
___



[issue12720] Expose linux extended filesystem attributes

2011-08-28 Thread Benjamin Peterson

Benjamin Peterson benja...@python.org added the comment:

2011/8/28 Antoine Pitrou rep...@bugs.python.org:

> Antoine Pitrou pit...@free.fr added the comment:
>
> Is it normal that listxattr() succeeds but getxattr() fails with ENOTSUP?
>
> >>> os.listxattr("/")
> []
> >>> os.getxattr("/", "foo")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> OSError: [Errno 95] Operation not supported

The reason you're getting ENOTSUP is that you have to use the proper prefix,
user.* for example.
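
A minimal sketch of the point about prefixes, assuming a Python that has the
os.setxattr()/os.getxattr() functions under discussion (Linux only). The helper
name roundtrip_user_xattr and the attribute name "user.foo" are made up for
illustration; the helper returns None rather than failing when the platform or
filesystem does not support user attributes:

```python
import os
import tempfile


def roundtrip_user_xattr(path, name="user.foo", value=b"bar"):
    """Set an attribute in the user.* namespace and read it back.

    Returns the stored bytes, or None when extended attributes are not
    available here (no os.setxattr, or the filesystem refuses the call).
    """
    if not hasattr(os, "setxattr"):
        return None
    try:
        os.setxattr(path, name, value)
        return os.getxattr(path, name)
    except OSError:
        # For this sketch, any OS-level refusal counts as "unsupported".
        return None


with tempfile.NamedTemporaryFile() as tmp:
    result = roundtrip_user_xattr(tmp.name)
```

With a bare name such as "foo" (no namespace prefix), the same call is expected
to fail as in the traceback quoted above.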

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12720
___



[issue12720] Expose linux extended filesystem attributes

2011-08-28 Thread Benjamin Peterson

Benjamin Peterson benja...@python.org added the comment:

After Antoine's review...

--
Added file: http://bugs.python.org/file23057/xattrs.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12720
___



[issue11969] Can't launch multiprocessing.Process on methods

2011-08-28 Thread terry.h

Changes by terry.h terry.her...@gmail.com:


--
nosy: +terry.h

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11969
___



[issue4028] Problem compiling the multiprocessing module on sunos5

2011-08-28 Thread Charles-François Natali

Charles-François Natali neolo...@free.fr added the comment:

Hello,

> there's some issues compiling the multiprocessing module on the SunOS
> I have here, where CMSG_LEN, CMSG_ALIGN, CMSG_SPACE and sem_timedwait
> are absent.

CMSG_LEN and friends should be defined by sys/socket.h (as required by 
POSIX). SunOS 5.10 man page lists them:

http://download.oracle.com/docs/cd/E19253-01/816-5173/socket.h-3head/index.html

But not the SunOS 5.9 version:
http://ewp.rpi.edu/hartford/webgen/sysdocs/C/solaris_9/SUNWaman/hman3head/socket.3head.html

> it looks like simply defining the first three macros like this works

It works, but it's probably not a good idea: if the headers don't define 
CMSG_LEN and friends, then FD passing will probably not work.
It'd be better to not compile multiprocessing_(sendfd|recvfd) if CMSG_LEN is 
not defined (patch attached).

> sem_timedwait are absent.

Hmmm.
Do you have the compilation's log?
Normally, if sem_timedwait isn't available, HAVE_SEM_TIMEDWAIT shouldn't be 
defined, and we should fall back to sem_trywait (by the way, calling sem_trywait 
multiple times until the timeout expires is not the same as calling 
sem_timedwait: this will fail in case of heavy contention).
So this should build correctly.

And this seems even stranger when I read Sebastian's message:

> so I had to commented out HAVE_SEM_TIMEDWAIT from setup.py, see:
>     elif platform.startswith('sunos5'):
>         macros = dict(
>             HAVE_SEM_OPEN=1,
>             HAVE_FD_TRANSFER=1
>             )
>             #HAVE_SEM_TIMEDWAIT=0,
>         libraries = ['rt']


Makes sense.
If we define HAVE_SEM_TIMEDWAIT=0, then code guarded by
#ifdef HAVE_SEM_TIMEDWAIT
will be compiled, and the linker won't be able to resolve sem_timedwait.
The preprocessor just checks that the symbol is defined, not that it has a 
non-zero value.
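
One way to avoid that trap on the setup.py side is to drop zero-valued entries
before they reach the compiler. A minimal sketch, assuming the macros end up as
distutils-style (name, value) pairs; the helper name enabled_macros is
hypothetical and is not what setup.py actually does:

```python
def enabled_macros(macros):
    """Return (name, value) pairs for truthy macros only.

    A macro passed as NAME=0 is still *defined*, so "#ifdef NAME" succeeds.
    Dropping zero-valued entries keeps such guards from firing.
    """
    return [(name, str(value))
            for name, value in sorted(macros.items())
            if value]


macros = dict(
    HAVE_SEM_OPEN=1,
    HAVE_FD_TRANSFER=1,
    HAVE_SEM_TIMEDWAIT=0,   # not available on this platform
)
define_macros = enabled_macros(macros)
```
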
To sum up: could someone with a SunOS box test the attached patch, and post the 
compilation logs if it still fails?

--
keywords: +patch
nosy: +neologix
Added file: http://bugs.python.org/file23058/multiprocessing_sendfd.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4028
___



[issue12287] ossaudiodev: stack corruption with FD = FD_SETSIZE

2011-08-28 Thread Charles-François Natali

Charles-François Natali neolo...@free.fr added the comment:

Alright, committed to 2.7, 3.2 and default.
Seems to work on all the buildbots, closing.

--
resolution:  -> fixed
stage: patch review -> committed/rejected
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12287
___



[issue12731] python lib re uses obsolete sense of \w in full violation of UTS#18 RL1.2a

2011-08-28 Thread Guido van Rossum

Guido van Rossum gu...@python.org added the comment:

[me]
>> But I really hope the re module (really: the _sre extension module)
>> can be fixed.

[Ezio]
> Starting to fix these issues from scratch doesn't make much sense IMHO.  We
> could extract the fixes from regex and merge them in re, but then again
> it's probably easier to just replace the whole module.

I have changed my mind at least half-way. I am open to having regex
(with some changes, details TBD) replace re in 3.3. (I am not yet 100%
convinced, but I'm not rejecting it as strongly as I was when I wrote
that comment in this bug. See the ongoing python-dev discussion on
this topic.)

>> We should also make a habit in our docs of citing specific versions
>> of the Unicode standard, and specific TR numbers and versions where
>> they apply.

> While this is a good thing it's not always doable.  Usually someone reports a
> bug related to something specified in some standard and only that part gets
> fixed.  Sometimes everything else is also updated to follow the whole
> standard, but often this happens incrementally, so we can't say, e.g., "the
> re module supports Unicode x.y" unless we go through the whole standard and
> fix/implement everything.

Hm. I think that for Unicode, following the whole standard consistently
may actually be important enough that we should attempt to do so and
not just chase bug reports. Now, we may consciously
decide not to implement a certain recommendation of the standard. E.g.
I'm not going to require that IronPython or Jython have string objects
that support O(1) indexing of code points, even though (assuming PEP 393
gets accepted) CPython will have them. But these decisions should be made
explicitly, and documented clearly.

Ideally, we need a Unicode czar -- a core developer whose job it is
to keep track of Python's compliance with various parts and versions
of the Unicode standard and who can nudge other developers towards
fixing bugs or implementing features, or update the documentation in
case things don't get added. (I like Tom's approach to Java 1.7, where
he submitted proposed doc fixes explaining the deviations from the
standard. Perhaps a bit passive-aggressive, but it was effective. :-)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12731
___



[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-28 Thread Guido van Rossum

Guido van Rossum gu...@python.org added the comment:

Thanks Tom for such a clear explanation! I hope someone will implement
this. (Matthew, does this affect regex? I am guessing it does, for
case-insensitive matching?)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12736
___



[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-28 Thread Guido van Rossum

Guido van Rossum gu...@python.org added the comment:

> PEP-393 will take care of iterating by code points.

Only for CPython. IronPython/Jython will still need a separate solution.
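
For implementations whose strings are sequences of UTF-16 code units, a
separate solution could look roughly like the following sketch. The helper
names utf16_units and iter_code_points are hypothetical, and the code works on
a plain list of integers so that it behaves the same on any build:

```python
import struct


def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def iter_code_points(units):
    """Yield code points from UTF-16 code units, joining surrogate pairs."""
    i = 0
    n = len(units)
    while i < n:
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        if is_high and i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
            # A high surrogate followed by a low surrogate encodes one
            # code point above U+FFFF.
            yield 0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            i += 2
        else:
            # BMP code unit, or a lone surrogate passed through unchanged.
            yield unit
            i += 1


# "a", U+1D11E (MUSICAL SYMBOL G CLEF), "b": four code units, three code points.
units = utf16_units("a\U0001D11Eb")
code_points = list(iter_code_points(units))
```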

> Where would you have other iterators go? The string module?
> Something else I have not thought of? Or something new?

Undecided. But I think we may want to create a new module which
provides various APIs specifically for apps that need care when
dealing with Unicode.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12729
___



[issue12797] io.FileIO and io.open should support openat

2011-08-28 Thread Terry J. Reedy

Terry J. Reedy tjre...@udel.edu added the comment:

I prefer a new parameter either at the end of the arglist or possibly keyword 
only. The idea for both variations is to let typical users ignore the option, 
which would be hard to do if it is part of the prime parameter. The idea for 
keyword only is that we might want to add other rarely used but useful options. 
They have no natural order, and having, say, 8 positional params is pretty 
wretched. (I have worked with such APIs.)
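
A rough sketch of the keyword-only variant. The wrapper name open_at and the
parameter name dir_fd are assumptions made for illustration, not a proposed
signature for io.open; the sketch relies on os.open() accepting a dir_fd
keyword on platforms that support it:

```python
import os
import tempfile


def open_at(name, flags=os.O_RDONLY, mode=0o777, *, dir_fd=None):
    """Open name and return a file descriptor.

    dir_fd is keyword-only, so callers who never need it can ignore it,
    and further rare options can be added later without fixing an order.
    """
    if dir_fd is None:
        return os.open(name, flags, mode)
    # Relative names are resolved against the directory behind dir_fd.
    return os.open(name, flags, mode, dir_fd=dir_fd)


# Usage: the common case needs no extra argument.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "w") as f:
        f.write("hi")
    fd = open_at(path)
    data = os.read(fd, 2)
    os.close(fd)
```

Passing the option positionally, as in open_at(path, os.O_RDONLY, 0o777, 3),
raises TypeError, which is what keeps the rarely used option out of the way.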

--
nosy: +terry.reedy

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12797
___



[issue12731] python lib re uses obsolete sense of \w in full violation of UTS#18 RL1.2a

2011-08-28 Thread Ezio Melotti

Ezio Melotti ezio.melo...@gmail.com added the comment:

> Ideally, we need a Unicode czar -- a core developer whose job it is
> to keep track of Python's compliance with various parts and versions
> of the Unicode standard and who can nudge other developers towards
> fixing bugs or implementing features, or update the documentation in
> case things don't get added.

We should first do a full review of the latest Unicode standard and see what's 
missing.  I think there might be parts of older Unicode versions (even  
Unicode 5) that are not yet implemented.  Chapter 3 is a good place to 
start, but I'm not sure that's enough -- there are a few TRs that should be 
considered as well.
If we manage to catch up with Unicode 6, then it shouldn't be too difficult to 
review the changes that every new version will introduce and open an issue for 
each (or a single issue if the changes are limited).
FWIW I'm planning to look at the conformance of the UTF codecs and fix them (if 
necessary) whenever I have time.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12731
___



[issue12805] Optimizations for bytes.join() et. al

2011-08-28 Thread Terry J. Reedy

Changes by Terry J. Reedy tjre...@udel.edu:


--
nosy: +terry.reedy

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12805
___



[issue12815] Coverage of smtpd.py

2011-08-28 Thread Terry J. Reedy

Changes by Terry J. Reedy tjre...@udel.edu:


--
components: +Library (Lib), Tests
versions: +Python 3.3

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12815
___



[issue12814] Possible intermittent bug in test_array

2011-08-28 Thread Terry J. Reedy

Terry J. Reedy tjre...@udel.edu added the comment:

Which Python version? 3.3?

--
components: +Library (Lib), Tests
nosy: +terry.reedy

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12814
___



[issue12808] Coverage of codecs.py

2011-08-28 Thread Terry J. Reedy

Changes by Terry J. Reedy tjre...@udel.edu:


--
components: +Library (Lib), Tests
versions: +Python 3.3 -Python 3.4

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12808
___



[issue12816] smtpd uses library outside of the standard libraries

2011-08-28 Thread Terry J. Reedy

Changes by Terry J. Reedy tjre...@udel.edu:


--
nosy: +terry.reedy

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12816
___



[issue12829] pyexpat segmentation fault caused by multiple calls to Parse()

2011-08-28 Thread Terry J. Reedy

Terry J. Reedy tjre...@udel.edu added the comment:

A note for anyone else: David is actually using the xml.parsers.expat module, 
which uses the now undocumented pyexpat module, whose direct use is deprecated.

David: Have you tested with 3.1 or 3.2? (I am about to try on Windows ;-).

--
nosy: +terry.reedy

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12829
___



[issue12829] pyexpat segmentation fault caused by multiple calls to Parse()

2011-08-28 Thread Terry J. Reedy

Terry J. Reedy tjre...@udel.edu added the comment:

Running with IDLE on Windows, I get no crash or uncaught exception but got 
these printed lines:

An error occurred during XML parsing.  Error ID: 9.  Error message: junk after 
document element
Line number: 1
An error occurred during XML parsing.  Error ID: 9.  Error message: junk after 
document element
Line number: 1
An error occurred during XML parsing.  Error ID: 9.  Error message: junk after 
document element
An error occurred during XML parsing.  Error ID: 9.  Error message: junk after 
document element
Line number: 1
An error occurred during XML parsing.  Error ID: 9.  Error message: junk after 
document element
Line number: 1
An error occurred during XML parsing.  Error ID: 9.  Error message: junk after 
document element

Is this the correct, expected output?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12829
___



[issue12836] ctypes.cast() creates circular reference in original object

2011-08-28 Thread Terry J. Reedy

Terry J. Reedy tjre...@udel.edu added the comment:

What action are you suggesting? Changing the ctypes code, its doc, or something 
else? If the doc, please suggest a specific change.

Can you test on 3.x?

--
nosy: +terry.reedy
title: cast() creates circular reference in original object -> ctypes.cast() 
creates circular reference in original object

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12836
___



[issue12843] file object read* methods in append mode overflows

2011-08-28 Thread Terry J. Reedy

Terry J. Reedy tjre...@udel.edu added the comment:

> I have confirmed that this only happens in windows.

This would literally mean that you tested on several other systems. Did you 
actually mean 'I have only confirmed that this happens on Windows', i.e. that 
you only tested on Windows?

The 2.6 series is in security-fix only mode.

--
nosy: +terry.reedy
versions:  -Python 2.6

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12843
___



[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-28 Thread Matthew Barnett

Matthew Barnett pyt...@mrabarnett.plus.com added the comment:

The regex module currently uses simple case-folding, although I'm working 
towards full case-folding, as listed in 
http://www.unicode.org/Public/UNIDATA/CaseFolding.txt.
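
To illustrate what full case-folding changes, here is a small sketch using
str.casefold(), which exists in Python 3.3 and later and applies full case
folding. It says nothing about the regex module's internals, and the helper
name caseless_equal is made up for the example:

```python
word = "Stra\u00dfe"  # "Straße", with LATIN SMALL LETTER SHARP S

# lower() is a case mapping: the sharp s is already lowercase and is kept.
lowered = word.lower()

# casefold() applies full case folding: the sharp s expands to "ss".
folded = word.casefold()


def caseless_equal(a, b):
    """Compare two strings after full case folding."""
    return a.casefold() == b.casefold()


same = caseless_equal("STRASSE", word)
```

A one-to-one (simple) folding cannot make these two spellings compare equal,
because it never changes the length of the string.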

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12736
___



[issue12849] urllib2 headers issue

2011-08-28 Thread Shubhojeet Ghosh

New submission from Shubhojeet Ghosh shubhojeet.gh...@yahoo.com:

There seems to be an issue with urllib2.
The headers defined in the code do not match those in the actual data packet 
(captured with Wireshark). Other header parameters, such as User-Agent and 
Cookie, work fine.
Here is an example of a failure:

Python Code:
import urllib2

url = "http://www.python.org"

req = urllib2.Request(url)
# Request a persistent connection; the capture below shows "close" instead.
req.add_header('Connection', 'keep-alive')
u = urllib2.urlopen(req)


Wireshark:
GET / HTTP/1.1

Accept-Encoding: identity

Connection: close

Host: www.python.org

User-Agent: Python-urllib/2.6

--
components: IO
messages: 143120
nosy: orsenthil, shubhojeet.ghosh
priority: normal
severity: normal
status: open
title: urllib2 headers issue
type: behavior
versions: Python 2.6

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12849
___



[issue12841] Incorrect tarfile.py extraction

2011-08-28 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

Should this bug be fixed in 3.3, or 2.7+3.2+3.3?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12841
___



[issue12846] unicodedata.normalize turkish letter problem

2011-08-28 Thread Terry J. Reedy

Terry J. Reedy tjre...@udel.edu added the comment:

You are doing two different things to the original string: normalizing and 
encoding to ascii with errors ignored. Each should be tested separately.
On 3.2:
import unicodedata
s1 = 'üfürükçü ağaç ve ıslıkçı çeşme'
s2 = unicodedata.normalize('NFKD', s1)
print(s2)
print(s2.encode('ascii','ignore'))

#prints
üfürükçü ağaç ve ıslıkçı çeşme
b'ufurukcu agac ve slkc cesme'

The dotless i (== '\u0131') in s2 does not encode to ascii and is properly 
dropped when the error is ignored.

I believe you are mistaken to think that unicodedata.normalize *should* turn 
the Turkish letter 'ı' == '\u0131' into 'i'. unicodedata.decomposition('ı') 
returns an empty string, as it should (see below), because that character has 
no decomposition in Unicode 6. So I am closing this issue as invalid.

Here is the entry from
http://www.unicode.org/Public/6.0.0/ucd/UnicodeData.txt
0131;LATIN SMALL LETTER DOTLESS I;Ll;0;L;;;;;N;;;0049;;0049
That is explained here
http://www.unicode.org/reports/tr44/tr44-6.html#UnicodeData.txt
The blank after 'L' (bidi class - left to right) is for decomposition type and 
mapping. There is none, so unicodedata.decomposition is correct. The last three 
entries are for uppercase, lowercase, and titlecase conversions. Those are 
different from normalizations.
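
The distinction between decomposition and case mapping can be checked directly. This is a minimal sketch using only the standard unicodedata module; the variable names are illustrative.

```python
import unicodedata

dotless_i = '\u0131'   # LATIN SMALL LETTER DOTLESS I
u_umlaut = '\u00fc'    # LATIN SMALL LETTER U WITH DIAERESIS

# Decomposition field: empty for dotless i, base letter + combining mark for u-umlaut.
no_decomp = unicodedata.decomposition(dotless_i)
has_decomp = unicodedata.decomposition(u_umlaut)

# With no decomposition, normalization leaves the character untouched.
unchanged = unicodedata.normalize('NFKD', dotless_i) == dotless_i

# Case mapping is a separate property: the uppercase field is 0049, i.e. 'I'.
upper = dotless_i.upper()

print(repr(no_decomp), has_decomp, unchanged, upper)
```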

To reinforce this,
http://www.unicode.org/Public/6.0.0/ucd/NormalizationTest.txt
says explicitly
@Part1 # Character by character test
# All characters not explicitly occurring in c1 of Part 1 have identical NFC, 
D, KC, KD forms.
'c1' is column 1, starting from 1.
In this list, 0130 is followed by 0132, omitting 0131, so the line above 
applies.

After writing this, I discovered that Lib/test/test_normalization.py runs the 
complete test specified in NormalizationTest.txt for code points that have and 
do not have normalization forms.

Side note: Python 2.6 is in security-fix-only mode.

--
nosy: +terry.reedy
resolution:  -> invalid
status: open -> closed
versions: +Python 2.7, Python 3.2 -Python 2.6

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12846
___



[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-28 Thread Terry J. Reedy

Terry J. Reedy tjre...@udel.edu added the comment:

> But I think we may want to create a new module which
> provides various APIs specifically for apps that need care when
> dealing with Unicode.

I have started thinking that way too -- perhaps unitools?
It could contain the code point iterator for the benefit of other 
implementations. Actually, since PEP 393 still allows insertion of surrogate 
values, it might not be completely useless for CPython either.
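
A code point iterator of the kind mentioned could look roughly like the sketch below. The function name code_points is hypothetical, not an existing or proposed API; it joins a high/low surrogate pair into one code point and passes everything else through, including lone surrogates.

```python
def code_points(s):
    """Yield integer code points, combining surrogate pairs found in s."""
    i, n = 0, len(s)
    while i < n:
        cp = ord(s[i])
        # A high surrogate followed by a low surrogate encodes one
        # supplementary-plane code point.
        if 0xD800 <= cp <= 0xDBFF and i + 1 < n:
            low = ord(s[i + 1])
            if 0xDC00 <= low <= 0xDFFF:
                yield 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00)
                i += 2
                continue
        # BMP characters and unpaired surrogates are yielded unchanged.
        yield cp
        i += 1

paired = list(code_points('a\ud83d\ude00b'))
lone = list(code_points('\ud83dx'))
print([hex(cp) for cp in paired])
```

On a narrow build every non-BMP character arrives as such a pair; on a PEP 393 build only explicitly inserted surrogates would take this path.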

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12729
___



[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-28 Thread Tom Christiansen

Tom Christiansen tchr...@perl.com added the comment:

Antoine Pitrou rep...@bugs.python.org wrote on Sat, 27 Aug 2011 20:04:56 
-: 

>> Neither am I.  Even in old-style English with ae and oe, one wrote
>> ÆGYPT and ÆSIR all caps but Ægypt and Æsir in titlecase, not *Aegypt or
>> *Aesir.  Similarly with ŒNOLOGY / Œnology / œnology, never *Oenology.

> Trying to disprove you a bit:
> http://ecx.images-amazon.com/images/I/51G6CH9XFFL._SL500_AA300_.jpg
> http://ecx.images-amazon.com/images/I/51k7TmosPdL._SL500_AA300_.jpg
> http://ecx.images-amazon.com/images/I/518UzMeLFCL._SL500_AA300_.jpg

> but classical typographies seem to write either the uppercase Œ or the
> lowercase œ.

That's what I meant: one only ever sees œufs or ŒUFS, never OEUFS.
French doesn't fit into ISO 8859-1, which has no œ/Œ.  Adding them is one of 
the changes ISO-8859-15 made relative to ISO-8859-1 (and Unicode):

iso-8859-1   A4  ⇔  U+00A4  < ¤ >  \N{CURRENCY SIGN}
iso-8859-15  A4  ⇒  U+20AC  < € >  \N{EURO SIGN}

iso-8859-1   A6  ⇔  U+00A6  < ¦ >  \N{BROKEN BAR}
iso-8859-15  A6  ⇒  U+0160  < Š >  \N{LATIN CAPITAL LETTER S WITH CARON}

iso-8859-1   A8  ⇔  U+00A8  < ¨ >  \N{DIAERESIS}
iso-8859-15  A8  ⇒  U+0161  < š >  \N{LATIN SMALL LETTER S WITH CARON}

iso-8859-1   B4  ⇔  U+00B4  < ´ >  \N{ACUTE ACCENT}
iso-8859-15  B4  ⇒  U+017D  < Ž >  \N{LATIN CAPITAL LETTER Z WITH CARON}

iso-8859-1   B8  ⇔  U+00B8  < ¸ >  \N{CEDILLA}
iso-8859-15  B8  ⇒  U+017E  < ž >  \N{LATIN SMALL LETTER Z WITH CARON}

iso-8859-1   BC  ⇔  U+00BC  < ¼ >  \N{VULGAR FRACTION ONE QUARTER}
iso-8859-15  BC  ⇒  U+0152  < Œ >  \N{LATIN CAPITAL LIGATURE OE}

iso-8859-1   BD  ⇔  U+00BD  < ½ >  \N{VULGAR FRACTION ONE HALF}
iso-8859-15  BD  ⇒  U+0153  < œ >  \N{LATIN SMALL LIGATURE OE}

iso-8859-1   BE  ⇔  U+00BE  < ¾ >  \N{VULGAR FRACTION THREE QUARTERS}
iso-8859-15  BE  ⇒  U+0178  < Ÿ >  \N{LATIN CAPITAL LETTER Y WITH DIAERESIS}
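
The same eight differences can be reproduced with Python's bundled codecs. This is a small sketch assuming Python 3 and the standard 'iso-8859-1' and 'iso-8859-15' codec names.

```python
# Decode each byte listed above with both codecs and record the code points.
differing_bytes = [0xA4, 0xA6, 0xA8, 0xB4, 0xB8, 0xBC, 0xBD, 0xBE]

differences = {}
for byte in differing_bytes:
    raw = bytes([byte])
    latin1 = raw.decode('iso-8859-1')
    latin9 = raw.decode('iso-8859-15')
    differences[byte] = (ord(latin1), ord(latin9))
    print('%02X  U+%04X  U+%04X' % (byte, ord(latin1), ord(latin9)))

# The oe ligature cannot be encoded in ISO 8859-1 at all.
try:
    '\u0153'.encode('iso-8859-1')
    encodable = True
except UnicodeEncodeError:
    encodable = False
print(encodable)
```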

> That said, I wonder why Unicode even includes ligatures like ff. Sounds
> like mission creep to me (and horrible annoyances for people like us).

I'm pretty sure that typographic ligatures are there for roundtripping
with legacy encodings.  I believe that œ/Œ is the only code point
with ligature in its name that you're supposed to still use, and
that all others should be figured out by modern fonting software.

--tom

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12736
___



[issue12839] zlibmodule cannot handle Z_VERSION_ERROR zlib error

2011-08-28 Thread Nadeem Vawda

Nadeem Vawda nadeem.va...@gmail.com added the comment:

Done. Once again, thanks for the report and the patch!

--
resolution:  -> fixed
stage:  -> committed/rejected
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12839
___



[issue12843] file object read* methods in append mode overflows

2011-08-28 Thread Amaury Forgeot d'Arc

Amaury Forgeot d'Arc amaur...@gmail.com added the comment:

You should call the .flush() method when switching from writes to reads.

Nothing really overflows, but the fread() function may return uninitialized 
memory.  In versions 2.x, Python uses the fopen, fread and fwrite functions 
(from the C library) and is subject to their limitations.

The exact behaviour is undefined, and it is quite possible that it only happens 
on Windows.  See also the discussion in #7952.

This issue does not exist in versions 3.x, where file functions have been 
rewritten.
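
A pattern that behaves the same on 2.x and 3.x is to flush and reposition explicitly whenever a file opened in append mode switches from writing to reading. The helper name append_then_read is hypothetical; the sketch uses a temporary file so it is self-contained.

```python
import os
import tempfile

def append_then_read(path, text):
    """Append text to path, then read the whole file back via the same handle."""
    with open(path, 'a+') as f:
        f.write(text)
        f.flush()      # push pending writes out before switching to reading
        f.seek(0)      # reposition explicitly; do not rely on the current offset
        return f.read()

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    first = append_then_read(path, 'one\n')
    second = append_then_read(path, 'two\n')
finally:
    os.remove(path)

print(repr(second))
```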

--
nosy: +amaury.forgeotdarc
resolution:  -> invalid
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12843
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue12754] Add alternative random number generators

2011-08-28 Thread douglas bagnall

douglas bagnall doug...@paradise.net.nz added the comment:

Earlier this year I wrote Python wrappers for a number of generators:

https://github.com/douglasbagnall/riffle

They are mostly cryptographic stream ciphers from the ESTREAM[1] project, but I 
was also interested in dSFMT[2], which is a SIMD optimised descendant of 
MT19937 which runs several times faster and directly produces doubles using 
cunning bit tricks.

[1]http://www.ecrypt.eu.org/stream/
[2]http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/#dSFMT

Wrapped in Python, the stream ciphers ran about as fast as MT19937 on my 
laptop, while dSFMT took about two thirds the time to run a 
random();random();random();... test.  For a slightly more realistic test 
(sum(random() for x in range(N))), the performance levelled right off.  As 
expected.

The stream cipher generators have some good properties.  They generally 
generate random bytes using something analogous to hash('%s%s' % (seed, 
counter)), which means different seeds produce well separated streams, and to 
skip forward or back in the stream, you just adjust the counter.  This would 
allow the reinstatement of Random()'s stream-skipping function, which some 
people (e.g. L'Ecuyer) think is important. (Incidentally, the MT people have 
come up with a jump-ahead algorithm for MT: 
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/JUMP/index.html).
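
The counter-based construction can be sketched in a few lines. This is a toy, not ChaCha or any of the wrapped ciphers: SHA-256 stands in for the cipher's block function, and the class name CounterRandom and its methods are hypothetical. It only illustrates why skipping forward or back reduces to counter arithmetic.

```python
import hashlib
import struct

class CounterRandom:
    """Toy counter-mode generator: block i is SHA-256(seed || i)."""

    def __init__(self, seed, counter=0):
        self.seed = seed          # bytes
        self.counter = counter    # 64-bit block counter

    def _block(self):
        packed = struct.pack('<Q', self.counter)
        self.counter = (self.counter + 1) % 2 ** 64
        return hashlib.sha256(self.seed + packed).digest()

    def random(self):
        # Use the top 53 bits of a block's first 8 bytes for a float in [0, 1).
        (x,) = struct.unpack('<Q', self._block()[:8])
        return (x >> 11) / 2.0 ** 53

    def jump(self, n):
        # Skipping forward or back is just counter arithmetic.
        self.counter = (self.counter + n) % 2 ** 64

a = CounterRandom(b'seed')
sequence = [a.random() for _ in range(5)]

b = CounterRandom(b'seed')
b.jump(3)                 # skip the first three outputs
jumped = b.random()
print(jumped == sequence[3])
```

A real implementation would use every byte of each block rather than discarding most of it, and would rely on a vetted cipher instead of a hash of convenience.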

Of the ciphers I tried, the chacha/salsa family and sosemanuk had the best 
combination of good testing and portable, reasonably fast, openly licensed C 
implementations.  HC128 and snow2 also perform well.

The chacha code is shorter than sosemanuk, so I would choose that.  It is used 
as a primitive in the BLAKE SHA3 candidate, which is a vote of confidence and 
an attractor of testing for the algorithm.

--
nosy: +dbagnall

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12754
___



[issue12754] Add alternative random number generators

2011-08-28 Thread douglas bagnall

douglas bagnall doug...@paradise.net.nz added the comment:

A bit more on the state size and period of the stream ciphers.

Chacha and Salsa use 64 bytes (512 bits) of state (vs ~2.5kB for MT19937).

Their counter is 64 bits, and their seed can be 320 bits (in cipher-speak, the 
seed is split between a 256 bit key and a 64 bit IV).

Each counter iteration produces 64 random bytes, or 8 doubles, so for any seed, 
you get a cycle of 2 ** 67 doubles, which would last on the order of 100 
thousand years on current PCs.
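
The arithmetic behind these figures, as a sketch. The word counts follow the original ChaCha/Salsa20 layout (four constant words, eight key words, two counter words, two IV words), which is where the remaining 128 bits of the 512-bit state go. The generation rate is an assumed round number, not a measurement.

```python
# State layout in 32-bit words (original 64-bit counter variant).
CONSTANT_WORDS, KEY_WORDS, COUNTER_WORDS, IV_WORDS = 4, 8, 2, 2
state_bits = 32 * (CONSTANT_WORDS + KEY_WORDS + COUNTER_WORDS + IV_WORDS)
seed_bits = 32 * (KEY_WORDS + IV_WORDS)

# Each counter value yields one 64-byte block, i.e. eight 8-byte doubles.
BYTES_PER_BLOCK = 64
BYTES_PER_DOUBLE = 8
doubles_per_block = BYTES_PER_BLOCK // BYTES_PER_DOUBLE
period_in_doubles = 2 ** 64 * doubles_per_block

# Assumption: 50 million doubles per second on one core.
ASSUMED_RATE = 50 * 1000 * 1000
SECONDS_PER_YEAR = 365.25 * 24 * 3600
years = period_in_doubles / ASSUMED_RATE / SECONDS_PER_YEAR

print(state_bits, seed_bits, period_in_doubles == 2 ** 67, round(years))
```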

Some of the other ciphers I looked at have smaller seeds and states, and some 
produce fewer bytes per iteration, but I don't think any of them will result in 
a cycle smaller than 2 ** 64.

PS: Regarding the discussion of something like Random.getrandbytes(n): +1

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12754
___



[issue12846] unicodedata.normalize turkish letter problem

2011-08-28 Thread Ezio Melotti

Changes by Ezio Melotti ezio.melo...@gmail.com:


--
nosy: +ezio.melotti
stage:  -> committed/rejected

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12846
___



[issue12754] Add alternative random number generators

2011-08-28 Thread Raymond Hettinger

Raymond Hettinger raymond.hettin...@gmail.com added the comment:

Thanks Douglas.  Can you say what the cryptographic guarantees are for Chacha 
and Salsa (seeing a stream of randoms doesn't allow you to deduce internal 
state, previous randoms, or future randoms)?  Is it suitably strong for gaming 
(dealing poker hands, lottery numbers, etc)?

I'm not sure I follow the notes on state size.  Is it 320 bits + 64 bits or is 
it 512 bits?  Also, I'm not sure that the smaller state is an advantage that 
users care about (unless they are pickling many instances of the prngs).

It's okay for jumpahead() to reappear in generators that support it, but that 
method can't be a mandatory part of the Random API because it doesn't make 
sense for many PRNGs where a jumpahead function isn't known.

With respect to the SIMD optimizations and longlong to double operations, I'm 
curious to take a look at how it was done yet wonder if there is a provable, 
portable implementation and also wonder if it is worth it (the speed of 
generating a random() tends to be dwarfed by surrounding code that actually 
uses the result -- allocating the python object, etc).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12754
___