[issue39109] [C-API] PyUnicode_FromString

2019-12-20 Thread STINNER Victor


STINNER Victor  added the comment:

> I think the ob_refcnt Field should be 1 in both cases. Or why is the refcnt 
> here so high?

Python has singletons for short strings: the empty string and 1-character latin1 
strings (Unicode range [U+0000; U+00FF]).

Examples:

>>> sys.getrefcount("")
103
>>> sys.getrefcount("a")
11

It's not a bug, but an optimization to reduce the memory footprint ;-)
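
The sharing described above can be checked from pure Python. This is a minimal sketch that assumes CPython; the helper name is_shared is made up for illustration, and the exact counts printed differ between versions and sessions (recent releases report a very large number for such objects).

```python
import sys

def is_shared(make):
    """Return True if two separate calls to make() yield the same object."""
    return make() is make()

# Build the strings at run time so the compiler's constant handling
# does not influence the result.
empty_shared = is_shared(lambda: "".join([]))    # empty string
latin1_shared = is_shared(lambda: chr(0x61))     # 1-character latin1 string "a"
longer_shared = is_shared(lambda: "".join(["spam", "eggs"]))  # ordinary string

print("empty string shared:    ", empty_shared)
print("latin1 character shared:", latin1_shared)
print("longer string shared:   ", longer_shared)

# A shared object is referenced from many places, hence the high count.
print("refcount of empty string:", sys.getrefcount(""))
```

The empty string and the single latin1 character come back as the same object every time, so every user of them adds to one common reference count.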

--
nosy: +vstinner
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39109] [C-API] PyUnicode_FromString

2019-12-20 Thread Yannick


New submission from Yannick :

Python version: 3.5
Tested with Visual Studio 2017 in a C-API extension.

When you have a UTF-8 encoded char buffer that is either empty or contains the 
single character 0, and you call PyUnicode_FromString() on this buffer, you get 
a PyObject*. The content looks good, but the reference counter looks strange. 

With the character 0 in the buffer, the ob_refcnt field is set to 100, and with 
an empty buffer, the ob_refcnt field is set to something around 9xx. 

Example Code: 
  std::string s1 = u8"";   // empty buffer
  std::string s2 = u8"0";  // buffer holding the character 0

  PyObject *o1 = PyUnicode_FromString(s1.c_str());
  // o1->ob_refcnt = 9xx
  PyObject *o2 = PyUnicode_FromString(s2.c_str());
  // o2->ob_refcnt = 100

I think the ob_refcnt Field should be 1 in both cases. Or why is the refcnt 
here so high?
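
The observation can also be reproduced without building an extension. This is a rough sketch that assumes CPython, where ctypes.pythonapi exposes the interpreter's own C API. The names from_string, empty and zero are illustrative only, and the counts printed vary by version.

```python
import ctypes
import sys

# CPython-only: call the real PyUnicode_FromString through ctypes.
from_string = ctypes.pythonapi.PyUnicode_FromString
from_string.argtypes = [ctypes.c_char_p]
# py_object tells ctypes the function returns a new reference to a Python object.
from_string.restype = ctypes.py_object

o1 = from_string(b"")    # same input as s1 in the report
o2 = from_string(b"0")   # same input as s2 in the report

# Strings obtained independently, for an identity comparison.
empty = "".join([])
zero = chr(0x30)

print("o1 is the shared empty string:", o1 is empty)
print("o2 is the shared '0' string:  ", o2 is zero)
print("refcount of o1:", sys.getrefcount(o1))
print("refcount of o2:", sys.getrefcount(o2))
```

The returned objects are already referenced elsewhere in the interpreter, so the caller owns only one of the many references counted in ob_refcnt.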

--
components: C API
messages: 358706
nosy: YannickSchmetz
priority: normal
severity: normal
status: open
title: [C-API] PyUnicode_FromString
type: behavior
versions: Python 3.5
