Marc-Andre Lemburg m...@egenix.com added the comment:
Alexander Belopolsky wrote:
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
In issue11303.diff, I add similar optimization for encode('latin1') and for
'utf8' variant of utf-8. I don't think dash-less
Marc-Andre Lemburg m...@egenix.com added the comment:
Alexander Belopolsky wrote:
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
What is the status of this? Status=open and Resolution=rejected contradict
each other.
Sorry, forgot to close the ticket.
Changes by Marc-Andre Lemburg m...@egenix.com:
--
status: open -> closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue5902
___
___
Marc-Andre Lemburg m...@egenix.com added the comment:
Alexander Belopolsky wrote:
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
Accepting all common forms for
encoding names means that you can usually give Python an encoding name
from, e.g. a HTML page, or any
Stefan Krah stefan-use...@bytereef.org added the comment:
Python works fine with Notepad generated scripts. I think this is a
CGI issue. Try following this tutorial:
http://www.imladris.com/Scripts/PythonForWindows.html
If you still suspect a bug, you should provide the exact CGI script
and
Marc-Andre Lemburg m...@egenix.com added the comment:
Alexander Belopolsky wrote:
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
Ezio and I discussed on IRC the implementation of alias lookup and neither of
us was able to point out to the function that strips
Raymond Hettinger rhettin...@users.sourceforge.net added the comment:
Okay fixed. The rsplit() method was mentioned in both underlying tracker
issues, so it got mentioned twice when once would have been enough :-)
--
assignee: docs@python -> rhettinger
nosy: +rhettinger
priority:
Steffen Daode Nurpmeso sdao...@googlemail.com added the comment:
I wonder what this normalize_encoding() does! Here is a pretty standard
version of mine which is a bit more expensive but catches many more cases!
This is stripped, of course, and can be rewritten very easily to Python's needs
david db.pub.m...@gmail.com added the comment:
This may be stupid but...
shouldn't the example be:
lynx http://localhost:8000/../../../../../etc/passwd
... which does _not_ work.
--
nosy: +db
___
Python tracker rep...@bugs.python.org
Steffen Daode Nurpmeso sdao...@googlemail.com added the comment:
(That is to say, i would do it. But not if _cpython is thrown to trash ,-);
i.e. not if there is not a slight chance that it gets actually patched in
because this performance issue probably doesn't mean a thing in real life.
New submission from Niko Matsakis n...@alum.mit.edu:
Executing code like this:
r = re.compile(r'(\w+)*=.*')
r.match("abcdefghijklmnopqrstuvwxyz")
takes a long time (around 12 seconds, on my machine). Presumably this is
because it is enumerating all the various ways to divvy up the alphabet
Graham Horler tryexc...@gmail.com added the comment:
Are we sure this is dead code, and not just out of date?
e.g. this works, and I use it in production with if Tkinter.TkVersion >= 8.4:
b = Tkinter.Button(root)
b.tk.call('tk::ButtonEnter', b._w)
--
nosy: +pysquared
New submission from SilentGhost ghost@gmail.com:
There is an extraneous entry in the sidebar of www.python.org.
It has two Chinese characters and leads to the download page.
--
messages: 129264
nosy: SilentGhost
priority: normal
severity: normal
status: open
title: extraneous link
SilentGhost ghost@gmail.com added the comment:
Sorry, I realise that this is my mistake.
--
resolution: -> invalid
status: open -> closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11308
New submission from Дилян Палаузов dilyan.palau...@aegee.org:
As of python 2.7.1 configured with --enable-ipv6 --enable-unicode
--with-system-expat --with-system-ffi --with-signal-module --with-threads
--with-wctype-functions --enable-shared:
Please #include wctype.h in
Marc-Andre Lemburg m...@egenix.com added the comment:
Дилян Палаузов wrote:
New submission from Дилян Палаузов dilyan.palau...@aegee.org:
As of python 2.7.1 configured with --enable-ipv6 --enable-unicode
--with-system-expat --with-system-ffi --with-signal-module --with-threads
R. David Murray rdmur...@bitdance.com added the comment:
Creating a test for this may not be practical :(
--
assignee: -> r.david.murray
nosy: +r.david.murray
stage: -> needs patch
___
Python tracker rep...@bugs.python.org
Changes by Jonas H. jo...@lophus.org:
Added file:
http://bugs.python.org/file20874/faster-find-library1-py3k-with-escaped-name.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11258
___
Jonathan Hayward jonathan.hayw...@pobox.com added the comment:
Thank you; noted. I'm closing the bug for now at least; I'll reopen it if need
be.
--
resolution: -> invalid
status: open -> closed
___
Python tracker rep...@bugs.python.org
Ezio Melotti ezio.melo...@gmail.com added the comment:
See also discussion on #5902.
Steffen, your normalization function looks similar to
encodings.normalize_encoding, with just a few differences (it uses spaces
instead of dashes, it divides alpha chars from digits).
If it doesn't slow down
Changes by STINNER Victor victor.stin...@haypocalc.com:
--
nosy: +haypo
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11303
___
___
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
On Thu, Feb 24, 2011 at 10:30 AM, Ezio Melotti rep...@bugs.python.org wrote:
..
See also discussion on #5902.
Mark has closed #5902 and indeed the discussion of how to efficiently
normalize encoding names (without
Steffen Daode Nurpmeso sdao...@googlemail.com added the comment:
.. i don't have actually invented this algorithm (but don't ask me where i got
the idea from years ago), i've just implemented the function you see. The
algorithm itself avoids some pitfalls in respect to combining numerics and
Steffen Daode Nurpmeso sdao...@googlemail.com added the comment:
(Everything else is beyond my scope. But normalizing _ to - is possibly a bad
idea as far as i can remember the situation three years ago.)
--
___
Python tracker
Steffen Daode Nurpmeso sdao...@googlemail.com added the comment:
P.P.S.: separating alphanumerics is a win for things like, e.g. UTF-16BE: it
gets 'utf 16 be' - think about the possible misspellings here and you see this
algorithm is a good thing
--
Changes by Ralf Schmitt sch...@gmail.com:
--
nosy: +schmir
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8036
___
___
Python-bugs-list mailing
Marc-Andre Lemburg m...@egenix.com added the comment:
Alexander Belopolsky wrote:
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
On Thu, Feb 24, 2011 at 10:30 AM, Ezio Melotti rep...@bugs.python.org wrote:
..
See also discussion on #5902.
Mark has closed
Suresh Kalkunte sskalku...@gmail.com added the comment:
Thanks for the education (hopefully a slight detour for you 8-). I included '/'
to convey uniform behavior across platforms.
I will take it that the difference in what os.path.split() returns on Win32 vs.
Linux is not a bug in Python
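The platform difference discussed in this comment can be reproduced on any operating system, because the standard library ships both path flavours as ntpath and posixpath. A minimal sketch follows; the sample paths are made up for illustration and are not taken from the original report.

```python
import ntpath      # the Windows implementation of os.path
import posixpath   # the POSIX implementation of os.path

sample = r"dir\sub\file.txt"  # hypothetical path that uses backslashes

# Under Windows rules the backslash is a separator, so the tail is the file name.
win_head, win_tail = ntpath.split(sample)

# Under POSIX rules the backslash is an ordinary character, so nothing is split off.
posix_head, posix_tail = posixpath.split(sample)

# A forward slash is accepted as a separator by both flavours.
win_slash = ntpath.split("dir/file.txt")
posix_slash = posixpath.split("dir/file.txt")

print(win_head, win_tail)
print(repr(posix_head), posix_tail)
```

Using '/' in portable code therefore gives the same split on both platforms, which seems to be the uniform behavior the comment refers to.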
Steffen Daode Nurpmeso sdao...@googlemail.com added the comment:
So, well, a-ha, i will boot my laptop this evening and (try to) write a patch
for normalize_encoding(), which will match the standard-conforming LATIN1 and
also will continue to support the illegal latin-1 without actually
Ezio Melotti ezio.melo...@gmail.com added the comment:
If the first normalization function is flexible enough to match most of the
spellings of the optimized encodings, they will all benefit from the optimization
without having to go through the long path.
(If the normalized encoding name is
STINNER Victor victor.stin...@haypocalc.com added the comment:
I think that the normalization function in unicodeobject.c (only used for
internal functions) can skip any character different than a-z, A-Z and 0-9.
Something like:
import re
def normalize(name): return re.sub("[^a-z0-9]", "",
STINNER Victor victor.stin...@haypocalc.com added the comment:
Patch implementing my suggestion.
--
Added file: http://bugs.python.org/file20875/aggressive_normalization.patch
___
Python tracker rep...@bugs.python.org
Ezio Melotti ezio.melo...@gmail.com added the comment:
That will also accept invalid names like 'iso88591' that are not valid now,
'iso 8859 1' is already accepted.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11303
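The objection above can be illustrated with a small sketch of an aggressive normalization in pure Python. It loosely follows the truncated snippet earlier in the thread; the helper name aggressive_normalize and the lower-casing step are assumptions made for illustration, and this is not the C code from the attached patch.

```python
import re

def aggressive_normalize(name):
    """Hypothetical helper: lower-case the name and drop everything except a-z and 0-9."""
    return re.sub("[^a-z0-9]", "", name.lower())

spellings = ["latin-1", "latin1", "Latin_1", "iso-8859-1", "iso 8859 1", "iso88591"]
normalized = {s: aggressive_normalize(s) for s in spellings}

for spelling, key in normalized.items():
    print("{!r:14} -> {!r}".format(spelling, key))
```

With this approach 'iso88591' collapses onto the same key as 'iso-8859-1', which is exactly the side effect being pointed out.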
Changes by STINNER Victor victor.stin...@haypocalc.com:
Removed file: http://bugs.python.org/file20875/aggressive_normalization.patch
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11303
___
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
On Thu, Feb 24, 2011 at 11:01 AM, Marc-Andre Lemburg
rep...@bugs.python.org wrote:
..
On this ticket, we're discussing just one application area: that
of the builtin short cuts.
Fair enough. I was hoping to close this
Marc-Andre Lemburg m...@egenix.com added the comment:
As promised, here's the list of places where the wrong Latin-1 encoding
spelling is used:
Lib//test/test_cmd_line.py:
-- for encoding in ('ascii', 'latin1', 'utf8'):
Lib//test/test_codecs.py:
-- ef = codecs.EncodedFile(f,
Marc-Andre Lemburg m...@egenix.com added the comment:
STINNER Victor wrote:
STINNER Victor victor.stin...@haypocalc.com added the comment:
I think that the normalization function in unicodeobject.c (only used for
internal functions) can skip any character different than a-z, A-Z and 0-9.
Marc-Andre Lemburg m...@egenix.com added the comment:
Alexander Belopolsky wrote:
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
On Thu, Feb 24, 2011 at 11:01 AM, Marc-Andre Lemburg
rep...@bugs.python.org wrote:
..
On this ticket, we're discussing just one
STINNER Victor victor.stin...@haypocalc.com added the comment:
Ooops, I attached the wrong patch. Here is the new fixed patch.
Without the patch:
import timeit
timeit.Timer("'a'.encode('latin1')").timeit()
3.8540711402893066
timeit.Timer("'a'.encode('latin-1')").timeit()
1.4946870803833008
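A measurement like the one above can be repeated with a short script. This is only a sketch: the helper time_spelling, the list of spellings and the reduced iteration count are assumptions chosen to keep the run short, and the numbers will differ by machine and Python version.

```python
import timeit

def time_spelling(spelling, number=100000):
    """Time 'a'.encode(<spelling>) for the given encoding spelling (hypothetical helper)."""
    stmt = "'a'.encode({!r})".format(spelling)
    return timeit.timeit(stmt, number=number)

timings = {s: time_spelling(s) for s in ("latin1", "latin-1", "utf8", "utf-8")}

for spelling, seconds in timings.items():
    print("{:8s} {:.4f}s".format(spelling, seconds))
```

Comparing the dashed and dash-less rows shows whether the interpreter in use takes the fast path for both spellings.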
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
On Thu, Feb 24, 2011 at 11:31 AM, Marc-Andre Lemburg
rep...@bugs.python.org wrote:
..
I think rather than removing any hyphens, spaces, etc. the
function should additionally:
* add hyphens whenever (they are missing
Steffen Daode Nurpmeso sdao...@googlemail.com added the comment:
So happy hacker haypo did it, different however. It's illegal, but since this
is a static function which only serves some specific internal strcmp(3)s it may
do for the mentioned charsets. I won't boot my laptop this evening.
Marc-Andre Lemburg m...@egenix.com added the comment:
STINNER Victor wrote:
STINNER Victor victor.stin...@haypocalc.com added the comment:
Ooops, I attached the wrong patch. Here is the new fixed patch.
That won't work, Victor, since it makes invalid encoding
names valid, e.g. 'utf(=)-8'.
Marc-Andre Lemburg m...@egenix.com added the comment:
Alexander Belopolsky wrote:
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
On Thu, Feb 24, 2011 at 11:31 AM, Marc-Andre Lemburg
rep...@bugs.python.org wrote:
..
I think rather than removing any hyphens,
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
On Thu, Feb 24, 2011 at 11:39 AM, Marc-Andre Lemburg
rep...@bugs.python.org wrote:
Marc-Andre Lemburg m...@egenix.com added the comment:
..
That won't work, Victor, since it makes invalid encoding
names valid, e.g.
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
'abc'.encode('utf(=)-8')
b'abc'
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11303
___
Ezio Melotti ezio.melo...@gmail.com added the comment:
That won't work, Victor, since it makes invalid encoding
names valid, e.g. 'utf(=)-8'.
That already works in Python (thanks to encodings.normalize_encoding).
The problem with the patch is that it makes names like 'iso88591' valid.
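The behavior mentioned here can be inspected from Python. The sketch below calls encodings.normalize_encoding and codecs.lookup, both of which exist in the standard library; the chosen sample names are only illustrative.

```python
import codecs
import encodings

# normalize_encoding() collapses runs of punctuation into a single underscore.
punct = encodings.normalize_encoding("utf(=)-8")
dashed = encodings.normalize_encoding("latin-1")

# codecs.lookup() resolves aliases; .name is the codec's canonical name.
names = ("latin1", "latin-1", "iso-8859-1", "utf8", "utf-8")
canonical = {name: codecs.lookup(name).name for name in names}

print(punct, dashed)
for name in names:
    print("{:12s} -> {}".format(name, canonical[name]))
```

This shows why a name with stray punctuation can already resolve through the slow path, independently of any shortcut in the C code.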
Éric Araujo mer...@netwok.org added the comment:
Agreed with Marc-André. It seems too magic and error-prone to do anything else
than stripping hyphens and spaces.
Steffen: This is a rather minor change in an area that is well known by several
developers, so don’t take it personally that
Steffen Daode Nurpmeso sdao...@googlemail.com added the comment:
That's ok by me.
And 'happy hacker haypo' was not meant unfriendly, i've only repeated the first
response i've ever posted back to this tracker (guess who was very fast at that
time :)).
--
Changes by Éric Araujo mer...@netwok.org:
--
resolution: -> fixed
stage: -> committed/rejected
status: open -> closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11234
___
Matthew Barnett pyt...@mrabarnett.plus.com added the comment:
It's a known issue (see issue #1662581, for example).
There's a new implementation at PyPI which doesn't have this problem:
http://pypi.python.org/pypi/regex
--
nosy: +mrabarnett
___
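For the slow pattern reported earlier, (\w+)*=.*, one possible workaround with the standard re module is to remove the nested quantifier. The sketch below assumes that only the match/no-match result matters (the captured group differs between the two patterns) and uses short, made-up sample strings so that it finishes quickly.

```python
import re

slow = re.compile(r'(\w+)*=.*')   # pattern from the report: a quantifier inside a quantifier
fast = re.compile(r'\w*=.*')      # assumed replacement: same strings match, no nesting

samples = ["key=value", "=value", "abcdefghij", "a b=c", ""]

# Compare the match/no-match outcome of both patterns on each sample.
agreement = [(bool(slow.match(s)), bool(fast.match(s))) for s in samples]

for sample, (a, b) in zip(samples, agreement):
    print("{!r:14} slow={} fast={}".format(sample, a, b))
```

The failing case ("abcdefghij", no '=' present) is the one where the nested version has to try many ways of splitting the word before giving up.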
Éric Araujo mer...@netwok.org added the comment:
Committed to py3k as r88545. You’ll notice that I fixed the nesting of the
versionchanged directive and that I changed my mind about “returns”. Thanks
again!
--
resolution: -> fixed
stage: patch review -> committed/rejected
status:
New submission from Terry J. Reedy tjre...@udel.edu:
The entry for bytearray(source...) says
The optional source parameter can be used to initialize the array in a few
different ways:
...
If it is an integer, the array will have that size and will be initialized with
null bytes.
...
Without
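The constructor forms quoted from the documentation can be tried directly. A minimal sketch in Python 3, with arbitrary sample values:

```python
zeros = bytearray(4)                    # integer: that many null bytes
from_bytes = bytearray(b"abc")          # bytes-like object: copied
from_text = bytearray("abc", "utf-8")   # string: an encoding must be given
from_ints = bytearray([97, 98, 99])     # iterable of integers in range(256)
empty = bytearray()                     # no argument

print(zeros, from_bytes, from_text, from_ints, empty)
```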
Daniel Stutzbach stutzb...@google.com added the comment:
In what use-cases would you want to call MyABC.register() when defining a class
instead of inheriting from MyABC?
I always thought of the register() as hack to make it possible to support types
written in C, which can't inherit from the
Amaury Forgeot d'Arc amaur...@gmail.com added the comment:
--with-wctype-functions was removed in 3.2 (see issue9210, r84752)
--
nosy: +amaury.forgeotdarc
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11309
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
Committed in r88546 (3.3) and r88548 (3.2).
Note that a simple work-around before 3.2.1 is to spell encoding as 'latin-1'
or 'iso-8859-1' in pickle.loads().
--
components: +Extension Modules -Library (Lib)
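The work-around spelling can be sketched as follows in Python 3. The byte string is a hand-written protocol 0 pickle of a Python 2 str, made up for illustration; it is not taken from the issue.

```python
import pickle

# Hand-written protocol 0 pickle of the Python 2 str 'caf\xe9' (illustrative data).
py2_pickle = b"S'caf\\xe9'\n."

# Spell the encoding with a dash, as suggested in the comment above.
text = pickle.loads(py2_pickle, encoding="latin-1")

# encoding="bytes" keeps the original 8-bit string as a bytes object.
raw = pickle.loads(py2_pickle, encoding="bytes")

print(repr(text), repr(raw))
```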
Giampaolo Rodola' g.rod...@gmail.com added the comment:
I'm going to commit the patch and then watch whether some of the buildbots turn
red.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10882
Antoine Pitrou pit...@free.fr added the comment:
I've committed the part of the patch which disallows a NULL data pointer with
PyMemoryView_FromBuffer in r88550 and r88551.
--
___
Python tracker rep...@bugs.python.org
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
On Thu, Feb 24, 2011 at 3:54 PM, Antoine Pitrou rep...@bugs.python.org wrote:
..
I've committed the part of the patch which disallows a NULL data pointer
with PyMemoryView_FromBuffer in r88550 and r88551.
Is it possible
Ezio Melotti ezio.melo...@gmail.com added the comment:
The attached patch is a proof of concept to see if Steffen's proposal might be
viable.
I wrote another normalize_encoding function that implements the algorithm
described in msg129259, adjusted the shortcuts and did some timings. (Note: the
Antoine Pitrou pit...@free.fr added the comment:
I've committed the part of the patch which disallows a NULL data pointer
with PyMemoryView_FromBuffer in r88550 and r88551.
Is it possible to create such buffer in Python (other than by
exploiting a bug or writing a rogue extension
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
+char lower[strlen(encoding)*2];
Is this valid in C-89?
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11303
Ezio Melotti ezio.melo...@gmail.com added the comment:
Probably not, but that part should be changed if possible, because it is less
efficient than the previous version that was allocating only 11 bytes.
The problem here is that the previous version was only changing/removing
chars, whereas
Éric Araujo mer...@netwok.org added the comment:
Someone may want to register with an ABC but not inherit methods or add a class
to the mro. It’s always been allowed by the register method; the new decorator
feature is just a very minor nicety on top of that.
Edoardo, was your request
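The difference between registering and inheriting can be sketched as below. The class names Shape and Square are hypothetical; using register() as a decorator relies on it returning the registered class, which is the case in Python 3.3 and later.

```python
from abc import ABC, abstractmethod

class Shape(ABC):              # hypothetical ABC for illustration
    @abstractmethod
    def area(self):
        ...

@Shape.register                # register() returns the class, so it works as a decorator
class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

sq = Square(3)
print(isinstance(sq, Shape), Shape in Square.__mro__)
```

Square passes isinstance() and issubclass() checks against Shape, but Shape is not added to its MRO and no methods are inherited from it.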
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
It seems appropriate to consult python-dev on this. I thought
ValueError was for values that are valid Python objects but out of
acceptable range of the function. Errors that can only be triggered
in C code normally
Antoine Pitrou pit...@free.fr added the comment:
It seems appropriate to consult python-dev on this. I thought
ValueError was for values that are valid Python objects but out of
acceptable range of the function. Errors that can only be triggered
in C code normally handled with either
Antoine Pitrou pit...@free.fr added the comment:
Thanks for the new patch. Looking again, I wonder if there's a reason the
original regexp was so complicated. ldconfig output here has lines such as:
libBrokenLocale.so.1 (libc6,x86-64, OS ABI: Linux 2.6.9) => /lib64/libBrokenLocale.so.1
New submission from Ville Skyttä ville.sky...@iki.fi:
Python 2.7 (r27:82500, Sep 16 2010, 18:02:00)
[GCC 4.5.1 20100907 (Red Hat 4.5.1-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
import StringIO
StringIO.StringIO("foo").readline(0)
'foo'
I don't think this is
Jonas H. jo...@lophus.org added the comment:
As far as I can tell, it doesn't matter.
We're looking for the part after the => in any case - ignoring the
ABI/architecture information - so the regex would choose the first of those
entries.
--
___
New submission from Ville Skyttä ville.sky...@iki.fi:
http://docs.python.org/library/stdtypes.html#file.readline
An empty string is returned only when EOF is encountered immediately.
I think this sentence is misleading especially because the word only in it is
emphasized, because an empty
Nick Coghlan ncogh...@gmail.com added the comment:
A SystemError indicates that an internal API was given bogus input or produces
bogus output (i.e. we screwed up somewhere, or a third party is messing with
interfaces they shouldn't be)
If data validation fails for part of the public C API
New submission from Alexander Belopolsky belopol...@users.sourceforge.net:
In Python 3.x default encoding is always utf-8, but encode()/decode() still try
to look it up. Attached patch eliminates a call to normalize_encoding and
several strcmp() calls.
--
files: default-encode.diff
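The default described in this submission is visible from Python 3 code: calling encode() or decode() without an argument gives the same result as passing 'utf-8' explicitly. A minimal sketch with an arbitrary sample string:

```python
text = "caf\u00e9"                       # arbitrary sample with a non-ASCII character

default_encoded = text.encode()          # no encoding argument
explicit_encoded = text.encode("utf-8")  # explicit spelling
round_trip = default_encoded.decode()    # decode() also defaults to UTF-8

print(default_encoded, explicit_encoded, round_trip)
```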
Changes by Ezio Melotti ezio.melo...@gmail.com:
--
nosy: +ezio.melotti
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11313
___
___
Changes by Antoine Pitrou pit...@free.fr:
--
nosy: +gregory.p.smith
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11314
___
___
Python-bugs-list
STINNER Victor victor.stin...@haypocalc.com added the comment:
Python 3.2 has a _posixsubprocess: some parts of subprocess are implemented in
C. Can you try it?
Python 3.2 uses also pipe2(), if available, to avoid the extra fcntl(4,
F_GETFD)+fcntl(4, F_SETFD, FD_CLOEXEC).
I suppose that the
Antoine Pitrou pit...@free.fr added the comment:
I think your analysis is wrong. These mmap() calls are for anonymous memory,
most likely they are emitted by the libc's malloc() to get some memory from the
kernel. In other words they will be blazingly fast.
I would suggest you try to dig
STINNER Victor victor.stin...@haypocalc.com added the comment:
That won't work, Victor, since it makes invalid encoding
names valid, e.g. 'utf(=)-8'.
.. but this *is* valid: ...
Ah yes, it's because of encodings.normalize_encoding(). It's funny: we have 3
functions to normalize an encoding
STINNER Victor victor.stin...@haypocalc.com added the comment:
more_aggressive_normalization.patch
Woops, normalizestring() comment points to itself.
normalize_encoding() might also point to the C implementations, at least in a
# comment.
--
___
Ezio Melotti ezio.melo...@gmail.com added the comment:
Patch looks good.
I checked the tests and couldn't find any test for .encode()/.decode() without
encoding, so I added them in the attached patch.
--
components: +Interpreter Core
stage: -> commit review
Added file:
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
Thanks for the review and the tests. I have found one more place that can be
easily optimized. (See patch below.) The decode() methods in bytes and
bytearray are not so easy unfortunately because for some reason they
Éric Araujo mer...@netwok.org added the comment:
Barry, could you try reproducing with distutils.sysconfig?
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11171
___
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
Committed issue11313.diff in revision 88553.
On the second thought, the getargs optimization is not worth the trouble
because in existing sources 'e' code is used with constant encodings and one is
unlikely to pass NULL
Barry A. Warsaw ba...@python.org added the comment:
On Feb 25, 2011, at 12:35 AM, Éric Araujo wrote:
Éric Araujo mer...@netwok.org added the comment:
Barry, could you try reproducing with distutils.sysconfig?
I'm not quite sure what you mean, but configuring Python 3.1 with different
--prefix
Ray.Allen ysj@gmail.com added the comment:
Here is the patch.
--
keywords: +patch
Added file: http://bugs.python.org/file20882/issue11287.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11287
New submission from Alexander Tsepkov atsep...@gmail.com:
in Lib/Cookie.py, BaseCookie load() method performs the following comparison on
line 624:
str(rawdata) == str()
This breaks when a unicode string is passed in for rawdata. I've included a
patch that fixes this issue by using
Alex alex.gay...@gmail.com added the comment:
Fun fact: io.StringIO does the right thing, but _io and _pyio.
--
nosy: +alex
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11311
___
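For comparison with the Python 2.7 StringIO session reported earlier, the sketch below shows how io.StringIO in Python 3 treats a size of 0. It only exercises the Python 3 io module and does not reproduce the 2.7 StringIO module.

```python
import io

buf = io.StringIO("foo")

zero = buf.readline(0)     # size 0: nothing is consumed
rest = buf.readline()      # the line is still available afterwards
at_eof = buf.readline()    # empty string because EOF was reached

print(repr(zero), repr(rest), repr(at_eof))
```

The first call returns an empty string even though the stream is not at EOF, which is the case the documentation wording about "only when EOF is encountered" does not cover.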
Eli Bendersky eli...@gmail.com added the comment:
A slightly revised patch committed in revision 88554:
1. Fixed Éric's whitespace comment
2. Fixed a test in test_descrtut.py which was listing list's methods
3. Moved the change to collections.py onto Lib/collections/__init__.py
4. Added NEWS
Aaron Sherman a...@ajs.com added the comment:
Python 3.2 has a _posixsubprocess: some parts of subprocess are implemented in
C. Can you try it?
I don't have a Python 3 installation handy, but I can see what I can do
tomorrow evening to get one set up and try it out.
disagree with the idea
Eli Bendersky eli...@gmail.com added the comment:
Following the python-dev discussion, attaching a patch for removing fcmp and
replacing its uses with assertAlmostEqual when needed.
All tests pass and patchcheck is clean.
Please review before I commit.
--
nosy: +terry.reedy
Added
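The replacement mentioned in this comment, assertAlmostEqual, can be sketched as follows. The test case name and the values are made up for illustration and are not taken from the patch.

```python
import unittest

class FloatComparison(unittest.TestCase):   # hypothetical test case
    def test_almost_equal(self):
        # By default the difference is rounded to 7 decimal places.
        self.assertAlmostEqual(0.1 + 0.2, 0.3)
        # 'places' loosens the comparison.
        self.assertAlmostEqual(1.0, 1.004, places=2)
        # 'delta' gives an absolute tolerance instead.
        self.assertAlmostEqual(100.0, 100.4, delta=0.5)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(FloatComparison)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```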