[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2020-03-14 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I thought there were at least 3-4 use cases in the core and stdlib.

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2020-03-14 Thread Inada Naoki
Change by Inada Naoki: resolution: -> fixed; stage: patch review -> resolved; status: open -> closed

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2020-03-14 Thread Inada Naoki
Inada Naoki added the comment: New changeset 3a8c56295d6272ad2177d2de8af4c3f824f3ef92 by Inada Naoki in branch 'master': Revert "bpo-39087: Add _PyUnicode_GetUTF8Buffer()" (GH-18985) https://github.com/python/cpython/commit/3a8c56295d6272ad2177d2de8af4c3f824f3ef92

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2020-03-13 Thread Inada Naoki
Change by Inada Naoki: pull_requests: +18333; pull_request: https://github.com/python/cpython/pull/18985

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2020-03-13 Thread Inada Naoki
Inada Naoki added the comment: I'm sorry about merging PR 18327, but I cannot find enough usage examples of _PyUnicode_GetUTF8Buffer. PyUnicode_AsUTF8AndSize is optimized, and utf8_cache is not so bad in most cases. So _PyUnicode_GetUTF8Buffer does not seem worthwhile. I will revert PR

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2020-03-13 Thread Inada Naoki
Change by Inada Naoki: pull_requests: +18332; pull_request: https://github.com/python/cpython/pull/18984

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2020-03-13 Thread Inada Naoki
Inada Naoki added the comment: New changeset c7ad974d341d3edb6b9d2a2dcae4d3d4794ada6b by Inada Naoki in branch 'master': bpo-39087: Add _PyUnicode_GetUTF8Buffer() (GH-17659) https://github.com/python/cpython/commit/c7ad974d341d3edb6b9d2a2dcae4d3d4794ada6b

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2020-02-26 Thread Inada Naoki
Inada Naoki added the comment: New changeset 02a4d57263a9846de35b0db12763ff9e7326f62c by Inada Naoki in branch 'master': bpo-39087: Optimize PyUnicode_AsUTF8AndSize() (GH-18327) https://github.com/python/cpython/commit/02a4d57263a9846de35b0db12763ff9e7326f62c

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2020-02-03 Thread Inada Naoki
Inada Naoki added the comment: The attached patch is the benchmark function I used in my previous post. Added file: https://bugs.python.org/file48879/bench-asutf8.patch

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2020-02-03 Thread Inada Naoki
Inada Naoki added the comment: I am still not sure whether we should add a new API only to avoid the cache.
* PyUnicode_AsUTF8String: when we need bytes or want to avoid the cache.
* PyUnicode_AsUTF8AndSize: when we need a C string and the cache is acceptable.
With PR-18327, PyUnicode_AsUTF8AndSize
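To make the trade-off between the two existing APIs concrete, here is a minimal sketch that calls them from Python through ctypes.pythonapi rather than from C. The helper names utf8_as_bytes and utf8_via_cache are hypothetical, and the sys.getsizeof comparison is only an indirect, CPython-specific way to observe the cached UTF-8 copy; it is not a measurement taken from this thread.

```python
import ctypes
import sys

api = ctypes.pythonapi

# PyObject *PyUnicode_AsUTF8String(PyObject *unicode)
# Returns a new bytes object; nothing is cached on the str.
api.PyUnicode_AsUTF8String.argtypes = [ctypes.py_object]
api.PyUnicode_AsUTF8String.restype = ctypes.py_object

# const char *PyUnicode_AsUTF8AndSize(PyObject *unicode, Py_ssize_t *size)
# Returns a pointer into a UTF-8 copy that is cached on the str object.
api.PyUnicode_AsUTF8AndSize.argtypes = [
    ctypes.py_object,
    ctypes.POINTER(ctypes.c_ssize_t),
]
api.PyUnicode_AsUTF8AndSize.restype = ctypes.c_void_p


def utf8_as_bytes(text):
    """Hypothetical helper: encode without touching the UTF-8 cache."""
    return api.PyUnicode_AsUTF8String(text)


def utf8_via_cache(text):
    """Hypothetical helper: read the cached C string and copy it out."""
    size = ctypes.c_ssize_t()
    ptr = api.PyUnicode_AsUTF8AndSize(text, ctypes.byref(size))
    # The pointer is owned by the str object, so copy the bytes while
    # `text` is still alive.
    return ctypes.string_at(ptr, size.value)


text = "caf\u00e9 " * 100  # non-ASCII, so the cache is a separate buffer

size_before = sys.getsizeof(text)
cached_copy = utf8_via_cache(text)
size_after = sys.getsizeof(text)

print(cached_copy == utf8_as_bytes(text))
print(size_before, size_after)
```

On CPython, size_after is expected to be larger than size_before for a non-ASCII string, because the cached UTF-8 copy stays attached to the str for the rest of its lifetime. That retained memory is the cost being weighed against adding a new API.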

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2020-02-03 Thread Inada Naoki
Change by Inada Naoki: pull_requests: +17701; pull_request: https://github.com/python/cpython/pull/18327

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2019-12-24 Thread Inada Naoki
Inada Naoki added the comment:
> I like this idea, but I think that we should at least notify Python-Dev about
> all additions to the public C API. If somebody has objections or a better
> idea, it is better to know earlier.
I created a post about this issue on discuss.python.org.

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2019-12-23 Thread Inada Naoki
Change by Inada Naoki: pull_requests: +17140; pull_request: https://github.com/python/cpython/pull/17683

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2019-12-21 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I like this idea, but I think that we should at least notify Python-Dev about all additions to the public C API. If somebody has objections or a better idea, it is better to know earlier.

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2019-12-19 Thread Inada Naoki
Change by Inada Naoki: keywords: +patch; pull_requests: +17127; stage: -> patch review; pull_request: https://github.com/python/cpython/pull/17659

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2019-12-19 Thread Inada Naoki
Inada Naoki added the comment:
> Don't you need to DECREF bytes somehow, at least, in case of failure?
Thanks. I will create a pull request with suggested changes.

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2019-12-19 Thread STINNER Victor
STINNER Victor added the comment:
    return PyBytesType.tp_as_buffer(bytes, view, PyBUF_CONTIG_RO);
Don't you need to DECREF bytes somehow, at least, in case of failure?

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2019-12-19 Thread Inada Naoki
Inada Naoki added the comment: s/return NULL/return -1/g

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2019-12-19 Thread Inada Naoki
Inada Naoki added the comment:
> Would it be possible to use a "container" object like a Py_buffer? Is there a
> way to customize the code executed when a Py_buffer is "released"?
It looks like a nice idea! Py_buffer.obj is decref-ed when releasing the buffer.
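The ownership rule mentioned here can be observed from Python, since memoryview is built on the same Py_buffer protocol. The sketch below is only an illustration that uses a bytes object as the buffer owner; refcounts_around_buffer is a hypothetical helper, and sys.getrefcount is CPython-specific.

```python
import sys


def refcounts_around_buffer(text):
    """Hypothetical helper: show that a buffer view owns a reference."""
    encoded = text.encode("utf-8")  # the object that owns the UTF-8 data
    before = sys.getrefcount(encoded)

    # Acquiring the buffer stores `encoded` as the view's owner
    # (Py_buffer.obj), which takes one extra reference.
    view = memoryview(encoded)
    during = sys.getrefcount(encoded)
    payload = bytes(view)

    # Releasing the buffer drops that reference again.
    view.release()
    after = sys.getrefcount(encoded)
    return before, during, after, payload


before, during, after, payload = refcounts_around_buffer("h\u00e9llo")
print(during - before, after - before, payload)
```

This is why a Py_buffer can act as the "container" asked about above: the caller only has to release the buffer, and whichever object owns the UTF-8 bytes is decref-ed as part of that release.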

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2019-12-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment:
> Would it be possible to use a "container" object like a Py_buffer?
Looks like a good idea.
    int PyUnicode_GetUTF8Buffer(Py_buffer *view, const char *errors)
nosy: +skrah

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2019-12-19 Thread STINNER Victor
STINNER Victor added the comment:
> The returned object is the owner of the *utf8*. You need to Py_DECREF() it after
> you finished to using the *utf8*. The owner may be not the unicode.
Would it be possible to use a "container" object like a Py_buffer? Is there a way to customize the

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2019-12-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Do you mean some concrete code? Several times I have wished for a similar feature: to get the UTF-8 cache if it exists, and otherwise to encode to UTF-8 without creating a cache. The private _PyUnicode_UTF8() macro could help:
    if ((s = _PyUnicode_UTF8(str))) { size =

[issue39087] [C API] No efficient C API to get UTF-8 string from unicode object.

2019-12-18 Thread STINNER Victor
Change by STINNER Victor: title: No efficient API to get UTF-8 string from unicode object. -> [C API] No efficient C API to get UTF-8 string from unicode object.