I wasn't comparing the ASCII and ISO-8859-1 decoders. I was asking whether
decoding b'abc' from ISO-8859-1 is faster than decoding b'ab\xff' from
ISO-8859-1, and if yes: why?
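One way to check the question empirically (a sketch; the buffer sizes and repeat counts are arbitrary):

```python
import timeit

# Time ISO-8859-1 decoding of a pure-ASCII buffer against one that
# contains a non-ASCII byte, so any ASCII fast path would show up.
ascii_data = b'abc' * 10000
latin1_data = b'ab\xff' * 10000

t_ascii = timeit.timeit(lambda: ascii_data.decode('iso-8859-1'), number=1000)
t_latin1 = timeit.timeit(lambda: latin1_data.decode('iso-8859-1'), number=1000)
print(f"ascii-only: {t_ascii:.4f}s, with 0xFF: {t_latin1:.4f}s")
```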
No, that makes no difference.
Your patch replaces PyUnicode_New(size, 255) ... memcpy() with
PyUnicode_FromUCS1().
You
This looks very nice. Is 3.3 a wide build? (how about a narrow build?)
It's a wide build. For reference, I also attach 64-bit narrow build
results, and 32-bit results (wide, narrow, and PEP 393). Savings are
much smaller in narrow builds (larger on 32-bit systems than on
64-bit systems).
By the way, I don't know if you're working on it, but StringIO seems a
bit broken right now. test_memoryio crashes here:
test_newline_cr (test.test_memoryio.CStringIOTest) ... Fatal Python error:
Segmentation fault
Current thread 0x7f3f6353b700:
File
On Sun, Aug 28, 2011 at 21:47, Martin v. Löwis mar...@v.loewis.de wrote:
result strings. In PEP 393, a buffer must be scanned for the
highest code point, which means that each byte must be inspected
twice (a second time when the copying occurs).
This may be a silly question: are there
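The two-pass behaviour described above can be sketched in Python (a simplification, not the actual C code; PEP 393 picks 1-, 2-, or 4-byte storage from the highest code point seen):

```python
def storage_width(code_points):
    # Pass 1: inspect every item to find the highest code point.
    maxchar = max(code_points, default=0)
    if maxchar < 0x100:
        return 1   # Latin-1 range: one byte per character
    if maxchar < 0x10000:
        return 2   # BMP: two bytes per character
    return 4       # full range: four bytes per character

def build_string(code_points):
    width = storage_width(code_points)   # pass 1 over the data
    # Pass 2: copy every item into the chosen representation.
    return ''.join(map(chr, code_points)), width
```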
On 29/08/2011 11:03, Dirkjan Ochtman wrote:
On Sun, Aug 28, 2011 at 21:47, Martin v. Löwis mar...@v.loewis.de wrote:
result strings. In PEP 393, a buffer must be scanned for the
highest code point, which means that each byte must be inspected
twice (a second time when the copying
On 28/08/2011 23:06, Martin v. Löwis wrote:
On 28.08.2011 22:01, Antoine Pitrou wrote:
- the iobench results are between 2% acceleration (seek operations),
16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and
37% for large sized reads (154 MB/s vs. 235 MB/s). The speed
On 29.08.2011 11:03, Dirkjan Ochtman wrote:
On Sun, Aug 28, 2011 at 21:47, Martin v. Löwis mar...@v.loewis.de wrote:
result strings. In PEP 393, a buffer must be scanned for the
highest code point, which means that each byte must be inspected
twice (a second time when the copying occurs).
Those haven't been ported to the new API, yet. Consider, for example,
d9821affc9ee. Before that, I got 253 MB/s on the 4096 units read test;
with that change, I get 610 MB/s. The trunk gives me 488 MB/s, so this
is a 25% speedup for PEP 393.
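The 25% figure follows directly from the quoted throughput numbers:

```python
trunk = 488    # MB/s on trunk (narrow build)
pep393 = 610   # MB/s on the pep-393 branch with d9821affc9ee
speedup = pep393 / trunk - 1
print(f"{speedup:.0%}")  # 25%
```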
If I understand correctly, the performance now
tl;dr: PEP-393 reduces the memory usage for strings of a very small
Django app from 7.4MB to 4.4MB, all other objects taking about 1.9MB.
On 26.08.2011 16:55, Guido van Rossum wrote:
It would be nice if someone wrote a test to roughly verify these
numbers, e.g. by allocating lots of strings
Martin v. Löwis wrote:
tl;dr: PEP-393 reduces the memory usage for strings of a very small
Django app from 7.4MB to 4.4MB, all other objects taking about 1.9MB.
On 26.08.2011 16:55, Guido van Rossum wrote:
It would be nice if someone wrote a test to roughly verify these
numbers, e.g. by
On Mon, 29 Aug 2011 22:32:01 +0200
Martin v. Löwis mar...@v.loewis.de wrote:
I have now written a Django application to measure the effect of PEP
393, using the debug mode (to find all strings), and sys.getsizeof:
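The elided measurement could have looked roughly like this (a sketch; the actual script used Django's debug mode to find the strings, while this version walks the GC's tracked containers, since str objects are not themselves GC-tracked):

```python
import gc
import sys

def total_string_footprint():
    # Collect every distinct str reachable as a referent of a
    # GC-tracked container and sum its sys.getsizeof.
    seen = set()
    total = 0
    for container in gc.get_objects():
        for ref in gc.get_referents(container):
            if isinstance(ref, str) and id(ref) not in seen:
                seen.add(id(ref))
                total += sys.getsizeof(ref)
    return total, len(seen)

total, count = total_string_footprint()
print(f"{count} strings, {total / 1024 ** 2:.1f} MiB")
```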
On Monday 29 August 2011 at 21:34:48, you wrote:
Those haven't been ported to the new API, yet. Consider, for example,
d9821affc9ee. Before that, I got 253 MB/s on the 4096 units read test;
with that change, I get 610 MB/s. The trunk gives me 488 MB/s, so this
is a 25% speedup for PEP
On 26.08.2011 16:56, Guido van Rossum wrote:
Also, please add the table (and the reasoning that led to it) to the PEP.
Done!
Martin
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
I would say no more than a 15% slowdown on each of the following
benchmarks:
- stringbench.py -u
(http://svn.python.org/view/sandbox/trunk/stringbench/)
- iobench.py -t
(in Tools/iobench/)
- the json_dump, json_load and regex_v8 tests from
http://hg.python.org/benchmarks/
I now
- the iobench results are between 2% acceleration (seek operations),
16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and
37% for large sized reads (154 MB/s vs. 235 MB/s). The speed
difference is probably in the UTF-8 decoder; I have already
restored the runs of ASCII
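The "runs of ASCII" fast path referred to here can be sketched in Python (a hypothetical simplification of the C decoder; it assumes well-formed UTF-8 input):

```python
def utf8_decode_ascii_runs(data: bytes) -> str:
    # Bytes below 0x80 are copied in bulk as a run; multi-byte
    # sequences fall back to ordinary per-sequence decoding.
    parts = []
    i, n = 0, len(data)
    while i < n:
        j = i
        while j < n and data[j] < 0x80:
            j += 1
        if j > i:                        # a run of pure ASCII
            parts.append(data[i:j].decode('ascii'))
            i = j
        else:                            # one multi-byte sequence
            lead = data[i]
            seqlen = 2 if lead < 0xE0 else 3 if lead < 0xF0 else 4
            parts.append(data[i:i + seqlen].decode('utf-8'))
            i += seqlen
    return ''.join(parts)
```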
On 28.08.2011 22:01, Antoine Pitrou wrote:
- the iobench results are between 2% acceleration (seek operations),
16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and
37% for large sized reads (154 MB/s vs. 235 MB/s). The speed
difference is probably in the UTF-8 decoder; I
On Sunday 28 August 2011 at 22:23 +0200, Martin v. Löwis wrote:
On 28.08.2011 22:01, Antoine Pitrou wrote:
- the iobench results are between 2% acceleration (seek operations),
16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and
37% for large sized reads (154 MB/s vs.
On 28.08.2011 22:01, Antoine Pitrou wrote:
- the iobench results are between 2% acceleration (seek operations),
16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and
37% for large sized reads (154 MB/s vs. 235 MB/s). The speed
difference is probably in the UTF-8 decoder; I
But strings are allocated via PyObject_Malloc(), i.e. the custom
arena-based allocator -- isn't its overhead (for small objects) less
than 2 pointers per block?
Ah, right, I missed that. Indeed, those have no header, and the only
overhead is the padding to a multiple of 8.
That shifts the
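The padding rule mentioned here is simply rounding the request up to the next multiple of 8:

```python
def padded_size(nbytes: int, align: int = 8) -> int:
    # PyObject_Malloc rounds small requests up to a multiple of 8,
    # so padding is the only per-block overhead for small objects.
    return (nbytes + align - 1) // align * align
```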
It would be nice if someone wrote a test to roughly verify these
numbers, e.g. by allocating lots of strings of a certain size and
measuring the process size before and after (being careful to adjust
for the list or other data structure required to keep those objects
alive).
--Guido
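A rough test along these lines could look like this (a Unix-only sketch; ru_maxrss is KiB on Linux but bytes on macOS, treated as KiB here, and the numbers are only indicative since RSS is noisy):

```python
import resource
import sys

def per_string_bytes(n=200_000, length=10):
    # Allocate many distinct strings, measure process-size growth,
    # and adjust for the list that keeps them alive.
    before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    keep = [('x' * length) + str(i) for i in range(n)]   # distinct strings
    after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    grown = (after - before) * 1024          # bytes, assuming KiB units
    grown -= sys.getsizeof(keep)             # subtract the list's overhead
    return grown / n

print(f"~{per_string_bytes():.1f} bytes per string")
```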
On Fri, Aug
Also, please add the table (and the reasoning that led to it) to the PEP.
On Fri, Aug 26, 2011 at 7:55 AM, Guido van Rossum gu...@python.org wrote:
It would be nice if someone wrote a test to roughly verify these
numbers, e.g. by allocating lots of strings of a certain size and
measuring the
Stefan Behnel, 25.08.2011 23:30:
Sadly, a quick look at a couple of recent commits in the pep-393 branch
suggested that it is not even always obvious to you as the authors which
macros can be called safely and which cannot. I immediately spotted a bug
in one of the updated core functions
On 26.08.2011 17:55, Stefan Behnel wrote:
Stefan Behnel, 25.08.2011 23:30:
Sadly, a quick look at a couple of recent commits in the pep-393 branch
suggested that it is not even always obvious to you as the authors which
macros can be called safely and which cannot. I immediately spotted a
Martin v. Löwis, 26.08.2011 18:56:
I agree with your observation that something should be done about error
handling, and will update the PEP shortly. I propose that
PyUnicode_Ready should be explicitly called on input where raising an
exception is feasible. In contexts where it is not feasible
Stefan Behnel, 26.08.2011 20:28:
Martin v. Löwis, 26.08.2011 18:56:
I agree with your observation that something should be done about error
handling, and will update the PEP shortly. I propose that
PyUnicode_Ready should be explicitly called on input where raising an
exception is feasible. In
With this PEP, the unicode object overhead grows to 10 pointer-sized
words (including PyObject_HEAD), that's 80 bytes on a 64-bit machine.
Does it have any adverse effects?
If I count correctly, it's only three *additional* words (compared to
3.2): four new ones, minus one that is removed. In
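The ten-word count can be checked by mirroring the PEP's struct layout with ctypes (field names follow PEP 393; this is a sketch of the layout, not the real header):

```python
import ctypes

class PyUnicodeObjectSketch(ctypes.Structure):
    # PyASCIIObject (6 words) + PyCompactUnicodeObject (3 more words)
    # + the data pointer of the full PyUnicodeObject.
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),  # PyObject_HEAD
        ("ob_type", ctypes.c_void_p),
        ("length", ctypes.c_ssize_t),
        ("hash", ctypes.c_ssize_t),
        ("state", ctypes.c_uint),         # bitfields, padded to a word
        ("wstr", ctypes.c_void_p),
        ("utf8_length", ctypes.c_ssize_t),
        ("utf8", ctypes.c_void_p),
        ("wstr_length", ctypes.c_ssize_t),
        ("data", ctypes.c_void_p),
    ]

word = ctypes.sizeof(ctypes.c_void_p)
assert ctypes.sizeof(PyUnicodeObjectSketch) == 10 * word  # 80 bytes on 64-bit
```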
On 25/08/2011 06:46, Stefan Behnel wrote:
Conversion to wchar_t* is common, especially on Windows.
That's an issue. However, I cannot say how common this really is in
practice. Surely depends on the specific code, right? How common is it
in core CPython?
Almost all functions taking text as
Hello,
On Thu, 25 Aug 2011 10:24:39 +0200
Martin v. Löwis mar...@v.loewis.de wrote:
On a 32-bit machine with a 32-bit wchar_t, pure-ASCII strings of length
1 (+NUL) will take the same memory either way: 8 bytes for the
characters in 3.2, 2 bytes in 3.3 + extra pointer + padding. Strings
of
Martin v. Löwis, 24.08.2011 20:15:
- issues to be considered (unclarities, bugs, limitations, ...)
A problem of the current implementation is the need for calling
PyUnicode_(FAST_)READY(), and the fact that it can fail (e.g. due to
insufficient memory). Basically, this means that even
Stefan Behnel, 25.08.2011 20:47:
Martin v. Löwis, 24.08.2011 20:15:
- issues to be considered (unclarities, bugs, limitations, ...)
A problem of the current implementation is the need for calling
PyUnicode_(FAST_)READY(), and the fact that it can fail (e.g. due to
insufficient memory).
On Thu, Aug 25, 2011 at 1:24 AM, Martin v. Löwis mar...@v.loewis.de wrote:
With this PEP, the unicode object overhead grows to 10 pointer-sized
words (including PyObject_HEAD), that's 80 bytes on a 64-bit machine.
Does it have any adverse effects?
If I count correctly, it's only three
Stefan Behnel, 25.08.2011 23:30:
Stefan Behnel, 25.08.2011 20:47:
Martin v. Löwis, 24.08.2011 20:15:
- issues to be considered (unclarities, bugs, limitations, ...)
A problem of the current implementation is the need for calling
PyUnicode_(FAST_)READY(), and the fact that it can fail (e.g.
Guido has agreed to eventually pronounce on PEP 393. Before that can
happen, I'd like to collect feedback on it. There have been a number
of voices supporting the PEP in principle, so I'm now interested in
comments in the following areas:
- objections in principle; I'll list them in the PEP.
- issues
On Wed, 24 Aug 2011 20:15:24 +0200
Martin v. Löwis mar...@v.loewis.de wrote:
- issues to be considered (unclarities, bugs, limitations, ...)
With this PEP, the unicode object overhead grows to 10 pointer-sized
words (including PyObject_HEAD), that's 80 bytes on a 64-bit machine.
Does it have any
With this PEP, the unicode object overhead grows to 10 pointer-sized
words (including PyObject_HEAD), that's 80 bytes on a 64-bit machine.
Does it have any adverse effects?
For pure ASCII, it might be possible to use a shorter struct:
typedef struct {
PyObject_HEAD
Py_ssize_t length;
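The quoted struct is cut off; a plausible completion along the lines suggested (hypothetical, mirrored here with ctypes) would keep only the fields a pure-ASCII string needs, with the characters stored inline after the header:

```python
import ctypes

class PyASCIIObjectSketch(ctypes.Structure):
    # Hypothetical shorter header for pure-ASCII strings: no wstr,
    # utf8, or data pointers, since the bytes would follow inline.
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),  # PyObject_HEAD
        ("ob_type", ctypes.c_void_p),
        ("length", ctypes.c_ssize_t),
        ("hash", ctypes.c_ssize_t),
        ("state", ctypes.c_uint),
        # characters follow the header inline, so no data pointer
    ]

print(ctypes.sizeof(PyASCIIObjectSketch))  # e.g. 40 on a 64-bit build
```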
Victor Stinner, 25.08.2011 00:29:
With this PEP, the unicode object overhead grows to 10 pointer-sized
words (including PyObject_HEAD), that's 80 bytes on a 64-bit machine.
Does it have any adverse effects?
For pure ASCII, it might be possible to use a shorter struct:
typedef struct {
Martin v. Löwis, 24.08.2011 20:15:
Guido has agreed to eventually pronounce on PEP 393. Before that can
happen, I'd like to collect feedback on it. There have been a number
of voices supporting the PEP in principle
Absolutely.
- conditions you would like to pose on the implementation before