Control: forwarded -1 https://github.com/python/cpython/issues/92132
Hi,

On Fri, 29 Apr 2022 21:00:52 -0700 Keith Amling <m...@amling2.org> wrote:
> From skimming some of cpython's "marshal" code [1] my best guess is that
> first difference is between it thinking the `_m` string might have
> another reference to it (and thus adding 0x80, or FLAG_REF to it) or
> not. This seems driven by whether or not python's object for the string
> has other references (it calls Py_REFCNT(v) to decide, see line 302).
>
> I assume the difference is whether or not python has bothered to collect
> some other reference to the string or not. Type "Z" is an interned
> string type, TYPE_SHORT_ASCII_INTERNED, which therefore makes sense that
> it might be shared with who knows what else. I'm assuming this stops
> reproducing when you change it to a unique name since no one else will share
> the reference and you'll just deterministically get no FLAG_REF.

thank you! It was indeed about that line and there exists a pull request
upstream that fixes this issue:

https://github.com/python/cpython/pull/8226

Specifically, the following patch to python3.10 in Debian seems to solve this.
I also attached a full debdiff for your convenience.

Thanks!

cheers, josch

From 6c8ea7c1dacd42f3ba00440231ec0e6b1a38300d Mon Sep 17 00:00:00 2001
From: Inada Naoki <songofaca...@gmail.com>
Date: Sat, 14 Jul 2018 00:46:11 +0900
Subject: [PATCH] Use FLAG_REF always for interned strings

---
 Python/marshal.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

--- a/Python/marshal.c
+++ b/Python/marshal.c
@@ -298,9 +298,14 @@ w_ref(PyObject *v, char *flag, WFILE *p)
     if (p->version < 3 || p->hashtable == NULL)
         return 0; /* not writing object references */
 
-    /* if it has only one reference, it definitely isn't shared */
-    if (Py_REFCNT(v) == 1)
+    /* If it has only one reference, it definitely isn't shared.
+     * But we use TYPE_REF always for interned string, to PYC file stable
+     * as possible.
+     */
+    if (Py_REFCNT(v) == 1 &&
+        !(PyUnicode_CheckExact(v) && PyUnicode_CHECK_INTERNED(v))) {
         return 0;
+    }
 
     entry = _Py_hashtable_get_entry(p->hashtable, v);
     if (entry != NULL) {
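
To make the byte-level difference discussed in the quoted analysis concrete, here is a minimal sketch of how a marshal type byte splits into a type code and the FLAG_REF bit. The constants mirror the values named in the thread (0x80 for FLAG_REF, "Z" for TYPE_SHORT_ASCII_INTERNED); the helper name split_type_byte is hypothetical and only for illustration, it is not part of the marshal module.

```python
# Values as named in the thread and defined in CPython's Python/marshal.c.
FLAG_REF = 0x80
TYPE_SHORT_ASCII_INTERNED = ord("Z")  # 0x5a


def split_type_byte(byte):
    """Hypothetical helper: split a type byte into (type_code, has_flag_ref)."""
    type_code = chr(byte & ~FLAG_REF & 0xFF)
    has_flag_ref = bool(byte & FLAG_REF)
    return type_code, has_flag_ref


# The same interned string can be serialized with either of these type
# bytes, depending on whether FLAG_REF was set when it was written.
without_ref = TYPE_SHORT_ASCII_INTERNED
with_ref = TYPE_SHORT_ASCII_INTERNED | FLAG_REF

print(hex(without_ref), split_type_byte(without_ref))
print(hex(with_ref), split_type_byte(with_ref))
```

Under these assumptions, two otherwise identical .pyc files would differ in exactly this one byte (0x5a versus 0xda) for the affected string.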
diff -Nru python3.10-3.10.4/debian/changelog python3.10-3.10.4/debian/changelog
--- python3.10-3.10.4/debian/changelog  2022-04-02 11:04:19.000000000 +0200
+++ python3.10-3.10.4/debian/changelog  2022-05-02 13:25:08.000000000 +0200
@@ -1,3 +1,10 @@
+python3.10 (3.10.4-3.1) UNRELEASED; urgency=medium
+
+  * Non-maintainer upload.
+  * Add patch to set FLAG_REF always for interned strings. Closes: #1010368
+
+ -- Johannes Schauer Marin Rodrigues <jo...@debian.org>  Mon, 02 May 2022 13:25:08 +0200
+
 python3.10 (3.10.4-3) unstable; urgency=medium
 
   * Build a python3.10-nopie package, diverting the python3.10
diff -Nru python3.10-3.10.4/debian/patches/series python3.10-3.10.4/debian/patches/series
--- python3.10-3.10.4/debian/patches/series  2022-04-02 11:04:19.000000000 +0200
+++ python3.10-3.10.4/debian/patches/series  2022-05-02 13:24:24.000000000 +0200
@@ -39,3 +39,4 @@
 fix-py_compile.diff
 reproducible-pyc.diff
 fix-ia64.diff
+use-FLAG_REF-always-for-interned-strings.diff
diff -Nru python3.10-3.10.4/debian/patches/use-FLAG_REF-always-for-interned-strings.diff python3.10-3.10.4/debian/patches/use-FLAG_REF-always-for-interned-strings.diff
--- python3.10-3.10.4/debian/patches/use-FLAG_REF-always-for-interned-strings.diff  1970-01-01 01:00:00.000000000 +0100
+++ python3.10-3.10.4/debian/patches/use-FLAG_REF-always-for-interned-strings.diff  2022-05-02 13:25:07.000000000 +0200
@@ -0,0 +1,37 @@
+From 5fe4c1442ded6bdc4c724935d118e996fa022eac Mon Sep 17 00:00:00 2001
+From: Inada Naoki <songofaca...@gmail.com>
+Date: Sat, 14 Jul 2018 00:46:11 +0900
+Subject: [PATCH] Use FLAG_REF always for interned strings
+
+https://bugs.debian.org/1010368
+https://github.com/python/cpython/pull/8226
+https://github.com/python/cpython/issues/92132
+
+---
+ Python/marshal.c | 9 +++++++--
+ 1 file changed, 7 insertions(+), 2 deletions(-)
+
+diff --git a/Python/marshal.c b/Python/marshal.c
+index 4125240..341c9aa 100644
+--- a/Python/marshal.c
++++ b/Python/marshal.c
+@@ -298,9 +298,14 @@ w_ref(PyObject *v, char *flag, WFILE *p)
+     if (p->version < 3 || p->hashtable == NULL)
+         return 0; /* not writing object references */
+ 
+-    /* if it has only one reference, it definitely isn't shared */
+-    if (Py_REFCNT(v) == 1)
++    /* If it has only one reference, it definitely isn't shared.
++     * But we use TYPE_REF always for interned string, to PYC file stable
++     * as possible.
++     */
++    if (Py_REFCNT(v) == 1 &&
++        !(PyUnicode_CheckExact(v) && PyUnicode_CHECK_INTERNED(v))) {
+         return 0;
++    }
+ 
+     entry = _Py_hashtable_get_entry(p->hashtable, v);
+     if (entry != NULL) {
+-- 
+2.35.1
+
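
As a rough illustration of why the patch makes the output stable, the early-return condition in w_ref() can be modelled in a few lines of Python. This is only a sketch of the condition shown in the hunk, not CPython code: the function name skips_ref_tracking and its parameters are hypothetical, and the reference count is passed in as a plain number rather than read from a real object.

```python
def skips_ref_tracking(refcount, is_interned_str, patched):
    """Model of the early `return 0` in w_ref().

    True means the object is written without FLAG_REF and is not
    recorded for later back-references.
    """
    if patched:
        # Patched condition: interned strings never take the early return.
        return refcount == 1 and not is_interned_str
    # Original condition: decided by the reference count alone.
    return refcount == 1


# An interned string whose reference count happens to differ between
# two otherwise identical builds.
for refcount in (1, 2):
    before = skips_ref_tracking(refcount, True, patched=False)
    after = skips_ref_tracking(refcount, True, patched=True)
    print(f"refcount={refcount} original_skips={before} patched_skips={after}")
```

In this model the original condition gives a different answer for refcount 1 and refcount 2, while the patched condition gives the same answer for both, which matches the intent stated in the patch comment.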