New submission from STINNER Victor:

In Python 3.3, I added the _PyUnicodeWriter API to factor out the code handling a Unicode "buffer": the code that allocates memory and resizes the buffer if needed.
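For readers unfamiliar with the pattern, here is a minimal sketch in plain C of the bookkeeping such a writer takes over from its callers: reserve room, resize the buffer if needed, then write. All names here (byte_writer, byte_writer_prepare, and so on) are hypothetical and are not the functions from the patch; growing by an extra 25% is only an illustrative choice, and the sketch uses malloc/realloc rather than Python's allocators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer; NOT the _PyBytesWriter from the patch. */
typedef struct {
    char *buf;         /* start of the heap buffer, or NULL */
    size_t size;       /* number of bytes written so far */
    size_t allocated;  /* current capacity in bytes */
    int overallocate;  /* if nonzero, request 25% more than needed */
} byte_writer;

static void byte_writer_init(byte_writer *w)
{
    w->buf = NULL;
    w->size = 0;
    w->allocated = 0;
    w->overallocate = 1;
}

/* Make sure `count` more bytes fit. Returns 0 on success, -1 on failure. */
static int byte_writer_prepare(byte_writer *w, size_t count)
{
    size_t needed, newsize;
    char *p;

    if (count > SIZE_MAX - w->size)
        return -1;                      /* size computation would overflow */
    needed = w->size + count;
    if (needed <= w->allocated)
        return 0;                       /* enough room already */

    newsize = needed;
    if (w->overallocate && newsize <= SIZE_MAX - newsize / 4)
        newsize += newsize / 4;         /* overallocate by 25% */

    p = realloc(w->buf, newsize);
    if (p == NULL)
        return -1;                      /* old buffer is still valid */
    w->buf = p;
    w->allocated = newsize;
    return 0;
}

/* Append `len` bytes from `data`. Returns 0 on success, -1 on failure. */
static int byte_writer_write(byte_writer *w, const void *data, size_t len)
{
    if (byte_writer_prepare(w, len) < 0)
        return -1;
    if (len > 0)
        memcpy(w->buf + w->size, data, len);
    w->size += len;
    return 0;
}

/* Append a single byte. */
static int byte_writer_write_char(byte_writer *w, char ch)
{
    return byte_writer_write(w, &ch, 1);
}

/* Release the buffer and reset the writer to its initial state. */
static void byte_writer_dealloc(byte_writer *w)
{
    free(w->buf);
    byte_writer_init(w);
}
```

A caller initializes the writer once, calls the write functions as often as needed without tracking capacity itself, and finally either takes ownership of buf/size or calls the dealloc function on error paths.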
I propose to do the same with a new _PyBytesWriter API. The API is very similar to _PyUnicodeWriter:

* _PyBytesWriter_Init(writer)
* _PyBytesWriter_Prepare(writer, count)
* _PyBytesWriter_WriteStr(writer, bytes_obj)
* _PyBytesWriter_WriteChar(writer, ch)
* _PyBytesWriter_Finish(writer)
* _PyBytesWriter_Dealloc(writer)

The patch changes the ASCII, Latin1, UTF-8 and charmap encoders to use the _PyBytesWriter API. A second patch changes the CJK encoders.

I have not run a benchmark yet. I wrote the patch to factor out common code, not to make the code faster.

Notes on performance:

* I took the "small buffer allocated on the stack" idea from the UTF-8 encoder, but the small buffer is always 500 bytes (instead of a size depending on the maximum character of the input Unicode string)
* _PyBytesWriter overallocates by 25% (when overallocation is enabled), whereas the charmap encoders double the buffer:

    /* exponentially overallocate to minimize reallocations */
    if (requiredsize < 2*outsize)
        requiredsize = 2*outsize;

* I didn't check whether the allocation size is the same with the patch. The min_size and overallocate attributes should be set correctly so that the code does not become slower.
* The code writing a single character into a _PyUnicodeWriter buffer is inlined in unicodeobject.c. The _PyBytesWriter API does not provide an inlined function for the same purpose.

----------
files: bytes_writer.patch
keywords: patch
messages: 187035
nosy: haypo, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Add _PyBytesWriter API
versions: Python 3.4
Added file: http://bugs.python.org/file29877/bytes_writer.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17742>
_______________________________________