[issue31377] remove *_INTERNED opcodes from marshal

2018-07-11 Thread INADA Naoki


INADA Naoki  added the comment:

I doubt that interning causes the reproducibility problem.

AFAIK, every string in a code object is either interned or not
interned, deterministically.

https://bugzilla.opensuse.org/show_bug.cgi?id=1049186
This issue seems to be caused by w_ref() depending on the object refcnt,
not by interning.
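
A minimal sketch of the refcount dependence, assuming a CPython build in
which w_ref() skips the reference memo for objects whose refcount is 1. The
helper names (make_payload, dump_unshared, dump_shared) are hypothetical and
exist only for illustration. Whether the two byte strings actually differ
depends on the interpreter version and build, so the sketch prints the
comparison rather than assuming a result.

```python
import marshal


def make_payload() -> bytes:
    # Hypothetical helper: build a fresh bytes object at run time so it is
    # not a shared compile-time constant.
    return bytes(bytearray(b"payload-") * 2)


def dump_unshared() -> bytes:
    # The tuple holds the only reference to its element.
    return marshal.dumps((make_payload(),))


def dump_shared() -> bytes:
    # A local variable holds a second reference to the same element.
    item = make_payload()
    return marshal.dumps((item,))


a = dump_unshared()
b = dump_shared()

# Both streams decode to equal values ...
print(marshal.loads(a) == marshal.loads(b))
# ... but the serialized bytes may or may not be identical.
print(a == b)
```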

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue31377] remove *_INTERNED opcodes from marshal

2017-09-07 Thread Benjamin Peterson

Benjamin Peterson added the comment:

On Thu, Sep 7, 2017, at 09:46, INADA Naoki wrote:
> 
> INADA Naoki added the comment:
> 
> > We end up interning each reference individually currently.
> 
> But interning an already-interned string is much faster: it only checks a flag.
> Interning a normal string requires a dict lookup.

We could make sure the version in the internal marshal memo is interned
if appropriate.

--




[issue31377] remove *_INTERNED opcodes from marshal

2017-09-07 Thread INADA Naoki

INADA Naoki added the comment:

> We end up interning each reference individually currently.

But interning an already-interned string is much faster: it only checks a flag.
Interning a normal string requires a dict lookup.
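
At the Python level the same distinction can be sketched with sys.intern(),
which is only an approximation of the C-level calls being discussed. The
example strings are arbitrary; the relative cost of each call is not measured
here, only the resulting object identity.

```python
import sys

# Build two equal strings at run time so neither starts out interned.
a = "".join(["marshal", "_", "demo"])
b = "".join(["marshal", "_", "demo"])

# First call: the string is looked up in the table of interned strings
# and added if it is not there yet.
first = sys.intern(a)

# Interning the result again returns the same object.
again = sys.intern(first)

# An equal but separately built string resolves to the interned object.
other = sys.intern(b)

print(first is again, first is other)
```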

--




[issue31377] remove *_INTERNED opcodes from marshal

2017-09-07 Thread Benjamin Peterson

Benjamin Peterson added the comment:

On Thu, Sep 7, 2017, at 01:17, INADA Naoki wrote:
> 
> INADA Naoki added the comment:
> 
> w_ref() already depends on refcnt.
> I don't think removing the *_INTERNED opcodes makes PYC files reproducible.
> https://github.com/python/cpython/blob/1f06a680de465be0c24a78ea3b610053955daa99/Python/marshal.c#L269-L271

I know; we're going to have to do something about that, too. In practice,
though, the interning behavior seems to be a bigger reproducibility
problem.

> I think "intern one string, then share it 10 times" is faster than
> "share one string 10 times, then intern each of 10 references".

We end up interning each reference individually currently.

--




[issue31377] remove *_INTERNED opcodes from marshal

2017-09-07 Thread INADA Naoki

INADA Naoki added the comment:

w_ref() already depends on refcnt.
I don't think removing the *_INTERNED opcodes makes PYC files reproducible.
https://github.com/python/cpython/blob/1f06a680de465be0c24a78ea3b610053955daa99/Python/marshal.c#L269-L271

I think "intern one string, then share it 10 times" is faster than
"share one string 10 times, then intern each of 10 references".
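
A rough sketch of the "share it 10 times" case, assuming CPython's default
marshal version (3 or later), where a repeated object is written once and
later occurrences are written as references to it. The variable names are
illustrative only.

```python
import marshal

# Build the string at run time and reference it ten times from one list.
name = "".join(["shared", "_", "identifier"])
data = [name] * 10

blob = marshal.dumps(data)
loaded = marshal.loads(blob)

# With reference sharing, the loaded list is expected to hold a single
# string object ten times rather than ten separate copies.
print(all(item is loaded[0] for item in loaded))

# Sizes with the default version and with version 2, which predates the
# general reference mechanism.
print(len(blob), len(marshal.dumps(data, 2)))
```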

--
nosy: +inada.naoki




[issue31377] remove *_INTERNED opcodes from marshal

2017-09-07 Thread Benjamin Peterson

Benjamin Peterson added the comment:

Used but not really supported. Anyway, I doubt intern round-tripping is 
particularly important.

--




[issue31377] remove *_INTERNED opcodes from marshal

2017-09-07 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Marshal is not used only for pyc files. It is also used for fast data serialization, 
faster than pickle, json, etc.
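
For illustration, a minimal round trip of plain data through marshal. The
record contents are made up, and no timing comparison is attempted. Note that
the marshal documentation does not guarantee the format to be stable across
Python versions.

```python
import marshal

# Arbitrary example data built from types that marshal supports.
record = {"id": 7, "tags": ["a", "b"], "ratio": 0.25, "blob": b"\x00\x01"}

encoded = marshal.dumps(record)   # serialize to bytes
decoded = marshal.loads(encoded)  # deserialize back to Python objects

print(decoded == record)
```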

--
nosy: +serhiy.storchaka




[issue31377] remove *_INTERNED opcodes from marshal

2017-09-06 Thread Benjamin Peterson

New submission from Benjamin Peterson:

The *_INTERNED opcodes tell the marshal reader to intern the encoded string 
after deserialization. I believe this is pointless for pycs because PyCode_New 
ends up interning all strings that are interesting to intern. Writing these 
opcodes makes pycs non-deterministic because the intern state may be 
inconsistent in the writer. See 
https://bugzilla.opensuse.org/show_bug.cgi?id=1049186
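
A small sketch of how intern state can leak into the output, assuming a
CPython version that still writes the interned opcodes. It serializes two
equal strings, one interned and one not, and prints the leading type byte of
each stream. That byte also carries a reference flag bit, so the exact values
depend on the interpreter, and the sketch does not assume a particular
result. The string "demo_attr" is an arbitrary example.

```python
import marshal
import sys

# Equal text, built at run time: one left as-is, one explicitly interned.
plain = "".join(["demo", "_", "attr"])
interned = sys.intern("".join(["demo", "_", "attr"]))

blob_plain = marshal.dumps(plain)
blob_interned = marshal.dumps(interned)

# Leading type byte of each stream (may differ between the two).
print(blob_plain[:1], blob_interned[:1])

# Either way, both streams decode to the same text.
print(marshal.loads(blob_plain) == marshal.loads(blob_interned))
```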

--
components: Interpreter Core
messages: 301569
nosy: benjamin.peterson
priority: normal
severity: normal
status: open
title: remove *_INTERNED opcodes from marshal
type: enhancement
versions: Python 3.7
