[issue17742] Add _PyBytesWriter API

2013-05-16 Thread STINNER Victor
STINNER Victor added the comment: The _PyBytesWriter API makes the code slower and does not really reduce the number of lines, so I'm closing this issue as invalid. -- resolution: -> invalid; status: open -> closed

[issue17742] Add _PyBytesWriter API

2013-04-27 Thread STINNER Victor
Changes by STINNER Victor: -- Removed message: http://bugs.python.org/msg187943

[issue17742] Add _PyBytesWriter API

2013-04-27 Thread STINNER Victor
STINNER Victor added the comment: Advantages of the patch: * finer control over how the buffer is allocated: only overallocate if the replacement string (while handling an encoding error) is longer than 1 byte/character. The "replace" error handler should never use overallocation, for example. O

[issue17742] Add _PyBytesWriter API

2013-04-27 Thread STINNER Victor
STINNER Victor added the comment: Advantages of the patch: * finer control over how the buffer is allocated: only overallocate if the replacement string (while handling an encoding error) is longer than 1 character. The "replace" error handler should never use overallocation, for example. Overal
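
The allocation policy described above can be illustrated with a minimal Python sketch. This is not the C _PyBytesWriter from the patch: the class `BytesWriterSketch`, the function `encode_ascii_replace`, and the 25% growth factor are all hypothetical names and values chosen for illustration. The sketch only shows the idea that overallocation stays off unless an error handler supplies a replacement longer than one byte.

```python
class BytesWriterSketch:
    """Hypothetical growable byte buffer; NOT the CPython _PyBytesWriter."""

    def __init__(self, min_size):
        self.buf = bytearray(min_size)  # storage preallocated to the expected size
        self.pos = 0                    # number of bytes written so far
        self.overallocate = False       # stays off until a long replacement is seen
        self.reallocs = 0               # how many times the buffer had to grow

    def prepare(self, extra):
        """Make sure there is room for `extra` more bytes."""
        needed = self.pos + extra
        if needed <= len(self.buf):
            return
        if self.overallocate:
            needed += needed // 4       # assumed growth factor (25% headroom)
        self.buf.extend(bytes(needed - len(self.buf)))
        self.reallocs += 1

    def write(self, data):
        self.prepare(len(data))
        self.buf[self.pos:self.pos + len(data)] = data
        self.pos += len(data)

    def finish(self):
        return bytes(self.buf[:self.pos])


def encode_ascii_replace(text, replacement=b"?"):
    """Encode to ASCII, substituting `replacement` for non-ASCII characters."""
    writer = BytesWriterSketch(len(text))  # one output byte expected per character
    for ch in text:
        if ord(ch) < 128:
            writer.write(bytes([ord(ch)]))
        else:
            # Only a replacement longer than one byte can outgrow the buffer.
            if len(replacement) > 1:
                writer.overallocate = True
            writer.write(replacement)
    return writer.finish(), writer.reallocs
```

With a one-byte replacement the preallocated buffer is always large enough, so the buffer never grows; a longer replacement triggers a single overallocated resize in the short example `encode_ascii_replace("a\xe9b", b"<?>")`.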

[issue17742] Add _PyBytesWriter API

2013-04-22 Thread STINNER Victor
STINNER Victor added the comment: "I expect that replacing "*p++ = c;" with "*writer.str++ = c;" would not add significant overhead, especially because writer is a local variable, and str is the first attribute of the structure. I hope that the machine code will be exactly the same." I ran some

[issue17742] Add _PyBytesWriter API

2013-04-21 Thread Antoine Pitrou
Antoine Pitrou added the comment: The last patch increases the size of the code substantially. I'm still wondering what the benefits are. $ hg di --stat Include/bytesobject.h | 90 ++ Misc/NEWS | 3 + Objects/bytesobject.c | 144

[issue17742] Add _PyBytesWriter API

2013-04-20 Thread STINNER Victor
STINNER Victor added the comment: I'm not completely satisfied with bytes_writer-2.patch. Most encoders work directly on a pointer (char*). If I want to keep the code using pointers (because it is efficient), I have to resynchronize the writer and the pointer before and after calling writer metho

[issue17742] Add _PyBytesWriter API

2013-04-20 Thread STINNER Victor
STINNER Victor added the comment: > It may eventually get one, though. If a use case for the "read-only hack" comes up, the hack can be added again later. It's better to start with something simple and extend it as new use cases appear.

[issue17742] Add _PyBytesWriter API

2013-04-20 Thread R. David Murray
R. David Murray added the comment: It may eventually get one, though. -- nosy: +r.david.murray

[issue17742] Add _PyBytesWriter API

2013-04-20 Thread STINNER Victor
STINNER Victor added the comment: > The patch contains a special case for writing only one bytes object. > This is a very unlikely case. The patch only modifies a few functions to make them use the new _PyBytesWriter API. Other functions can use it. A few examples: - PyBytes_FromObject() - bina

[issue17742] Add _PyBytesWriter API

2013-04-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: _PyBytesWriter and _PyUnicodeWriter have different use cases. While _PyUnicodeWriter is used primarily in formatters, where the resulting size is rarely known, and reallocation in decoders is usually caused by widening the result, _PyBytesWriter is used only in decoders where w

[issue17742] Add _PyBytesWriter API

2013-04-18 Thread STINNER Victor
STINNER Victor added the comment: New version of the patch: - addresses most (all?) of Serhiy's remarks - _PyBytesWriter_PrepareInternal() always uses min_size, not only when overallocate is 1 - adds more comments. Performance is almost the same as without the patch. It looks like they are a li

[issue17742] Add _PyBytesWriter API

2013-04-18 Thread STINNER Victor
STINNER Victor added the comment: Benchmark script, should be used with: https://bitbucket.org/haypo/misc/src/tip/python/benchmark.py -- Added file: http://bugs.python.org/file29928/bench_encoders.py

[issue17742] Add _PyBytesWriter API

2013-04-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I have added some comments on Rietveld.

[issue17742] Add _PyBytesWriter API

2013-04-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I ran my own benchmarks and don't see any regression beyond random noise. Actually, I even see a speedup in some cases: ./python -m timeit -s "a = '\x80'*1" "a.encode()" Before patch: 29.8 usec per loop, after patch: 21.5 usec per loop. This is just a
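
For readers who want to repeat this kind of measurement from a script rather than the command line, here is a minimal sketch using the standard `timeit` module. The helper name `time_encode` and its default `number` and `repeat` values are assumptions made for this example. The string size is left as a parameter because the repeat count in the quoted command may not be the one originally used.

```python
import timeit


def time_encode(text, number=1000, repeat=5):
    """Return the best observed time per a.encode() call, in microseconds."""
    timer = timeit.Timer("a.encode()", globals={"a": text})
    # Take the minimum over several runs to reduce the influence of noise.
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e6


if __name__ == "__main__":
    sample = "\x80" * 1  # same setup expression as the quoted command
    print("%.3f usec per loop" % time_encode(sample))
```

Running the same script against builds with and without the patch gives a before/after comparison; absolute numbers depend on the machine and build options.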

[issue17742] Add _PyBytesWriter API

2013-04-17 Thread STINNER Victor
STINNER Victor added the comment: Here is a benchmark. It looks like the overallocation is not configured correctly for the charmap and UTF-8 encoders: it is always enabled for the UTF-8 encoder (whereas it should only be enabled on the first call to an error handler), and never enabled for charmap en

[issue17742] Add _PyBytesWriter API

2013-04-16 Thread Antoine Pitrou
Antoine Pitrou added the comment: > I did not run a benchmark yet. I wrote a patch to factorize the code, > not to make the code faster. Your patches don't seem to reduce the line count, so I don't understand the point.

[issue17742] Add _PyBytesWriter API

2013-04-16 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: -- nosy: +pitrou

[issue17742] Add _PyBytesWriter API

2013-04-16 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Could you please provide the benchmark results? I am afraid that it may hurt performance. Results on Windows are especially interesting. -- components: +Interpreter Core; stage: -> patch review; type: -> enhancement

[issue17742] Add _PyBytesWriter API

2013-04-15 Thread STINNER Victor
STINNER Victor added the comment: See also #17694.

[issue17742] Add _PyBytesWriter API

2013-04-15 Thread STINNER Victor
Changes by STINNER Victor: Added file: http://bugs.python.org/file29878/bytes_writer_cjkcodecs.patch

[issue17742] Add _PyBytesWriter API

2013-04-15 Thread STINNER Victor
New submission from STINNER Victor: In Python 3.3, I added the _PyUnicodeWriter API to factor out the code handling a Unicode "buffer": just the code to allocate memory and resize the buffer if needed. I propose to do the same with a new _PyBytesWriter API. The API is very similar to _PyUnicodeWriter: