[issue45340] Lazily create dictionaries for plain Python objects

2021-11-30 Thread Irit Katriel


Irit Katriel  added the comment:

I believe this may have caused the regression in Issue45941.

--
nosy: +iritkatriel

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue45340] Lazily create dictionaries for plain Python objects

2021-10-20 Thread Mark Shannon


Mark Shannon  added the comment:

Josh, please reopen if you have more to add.

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue45340] Lazily create dictionaries for plain Python objects

2021-10-13 Thread Mark Shannon


Mark Shannon  added the comment:


New changeset a8b9350964f43cb648c98c179c8037fbf3ff8a7d by Mark Shannon in 
branch 'main':
bpo-45340: Don't create object dictionaries unless actually needed (GH-28802)
https://github.com/python/cpython/commit/a8b9350964f43cb648c98c179c8037fbf3ff8a7d


--




[issue45340] Lazily create dictionaries for plain Python objects

2021-10-12 Thread Mark Shannon


Mark Shannon  added the comment:

Josh,

I'm not really following the details of what you are saying.

You claim "Key-sharing dictionaries were accepted largely without question 
because they didn't harm code that broke them".
Is that true? I don't remember it that way. They were accepted because they 
saved memory and didn't slow things down.

This issue proposes the same thing: less memory used, and no slower or a bit 
faster.

If you are curious about how the first few instances of a class are handled, it 
is described here: 
https://github.com/faster-cpython/ideas/issues/72#issuecomment-920117600

Lazy attribute creation is not an issue here. How well keys are shared across 
instances depends on the dictionary implementation and was improved by 
https://github.com/python/cpython/pull/28520


It would be helpful if you could give specific examples where you think this 
change would use more memory or be slower.

--




[issue45340] Lazily create dictionaries for plain Python objects

2021-10-08 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Hmm... And there's one other issue (that wouldn't affect people until they 
actually start worrying about memory overhead). Right now, if you want to 
determine the overhead of an instance, the options are:

1. Has __dict__: sys.getsizeof(obj) + sys.getsizeof(obj.__dict__)
2. Lacks __dict__ (built-ins, slotted classes): sys.getsizeof(obj)

With this change, even checking whether an object using this layout has a 
__dict__ would create one. Without additional introspection support, there's no 
way to tell the real memory usage of the instance without changing the memory 
usage (for the worse).
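
For illustration, the two measurements listed above can be sketched as 
follows. The class names and the instance_overhead helper are hypothetical, 
and sys.getsizeof only reports shallow sizes, so treat this as a rough sketch 
rather than a precise accounting:

```python
import sys


class WithDict:
    """Ordinary class: instances carry a __dict__."""

    def __init__(self):
        self.x = 1
        self.y = 2


class Slotted:
    """Slotted class: instances have no __dict__."""

    __slots__ = ("x", "y")

    def __init__(self):
        self.x = 1
        self.y = 2


def instance_overhead(obj):
    """Shallow size of obj plus the shallow size of its __dict__, if any."""
    size = sys.getsizeof(obj)
    # Caveat from the comment above: under lazy dict creation, merely
    # asking for __dict__ may create the dictionary being measured.
    attrs = getattr(obj, "__dict__", None)
    if attrs is not None:
        size += sys.getsizeof(attrs)
    return size


print(instance_overhead(WithDict()))
print(instance_overhead(Slotted()))
```

The exact numbers depend on the interpreter version and platform, so none are 
shown here.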

--




[issue45340] Lazily create dictionaries for plain Python objects

2021-10-08 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Hmm... Key-sharing dictionaries were accepted largely without question because 
they didn't harm code that broke them (said code gained nothing, but lost 
nothing either), and provided a significant benefit. Specifically:

1. They imposed no penalty on code that violated the code-style recommendation 
to initialize all variables consistently in __init__ (code that always ended up 
using a non-sharing dict). Such classes don't benefit, but neither do they get 
penalized (just a minor CPU cost to unshare when it realized sharing wouldn't 
work). 

2. They imposed no penalty for using vars(object)/object.__dict__ when you 
didn't modify the set of keys (so reading or changing values of existing 
attributes caused no problems).

The initial version of this worsens case #2; you'd have to convert to 
key-sharing dicts, and possibly to unshared dicts a moment later, if the set of 
attributes is changed. And when it happens, you'd be paying the cost of the now 
defunct values pointer storage for the life of each instance (admittedly a 
small cost).

But the final proposal compounds this, because the penalty for lazy attribute 
creation (directly, or dynamically by modifying via vars()/__dict__) is now a 
per-instance cost of n pointers (one for each value).

The CPython codebase rarely uses lazy attribute creation, but AFAIK there is no 
official recommendation to avoid it (not in PEP 8, not in the official 
tutorial, not even in PEP 412 which introduced Key-Sharing Dictionaries). 
Imposing a fairly significant penalty on people who aren't even violating 
language recommendations, let alone language rules, seems harsh.
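
For illustration, "lazy attribute creation" as discussed above might look like 
the following minimal sketch. The class and attribute names are hypothetical; 
the point is only that Lazy instances gain attributes after __init__, both 
directly and through vars():

```python
class Eager:
    """Every attribute is assigned in __init__, in a consistent order."""

    def __init__(self):
        self.a = 1
        self.b = None  # placeholder, filled in later


class Lazy:
    """Some attributes only appear after __init__ has finished."""

    def __init__(self):
        self.a = 1

    def compute(self):
        self.b = 2  # attribute created lazily, outside __init__


eager = Eager()
lazy = Lazy()
lazy.compute()
vars(lazy)["c"] = 3  # attribute added dynamically via vars()/__dict__

print(sorted(vars(eager)))
print(sorted(vars(lazy)))
```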

I'm not against this initial version (one pointer wasted isn't so bad), but the 
additional waste in the final version worries me greatly.

Beyond the waste, I'm worried how you'd handle the creation of the first 
instance of such a class; you'd need to allocate and initialize an instance 
before you know how many values to tack on to the object. Would the first 
instance use a real dict during the first __init__ call that it would use to 
realloc the instance (and size all future instances) at the end of __init__? Or 
would it be realloc-ing for each and every attribute creation? In either case, 
threading issues seem like a problem.

Seems like:

1. Even in the ideal case, this only slightly improves memory locality, and 
only provides a fixed reduction in memory usage per-instance (the dict header 
and a little allocator round-off waste), not one that scales with number of 
attributes.

2. Classes that would benefit from this would typically do better to use 
__slots__ (now that dataclasses.dataclass supports slots=True, encouraging that 
as a default use case adds little work for class writers to use them)

If the gains are really impressive, might still be worth it. But I'm just 
worried that we'll make the language penalize people who don't know to avoid 
lazy attribute creation. And the complexity of this layered:

1. Not-a-dict
2. Key-sharing-dict
3. Regular dict

approach makes me worry it will allow subtle bugs in key-sharing dicts to go 
unnoticed (because so little code would still use them).

--
nosy: +josh.r




[issue45340] Lazily create dictionaries for plain Python objects

2021-10-07 Thread Mark Shannon


Change by Mark Shannon :


--
keywords: +patch
pull_requests: +27125
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/28802




[issue45340] Lazily create dictionaries for plain Python objects

2021-10-02 Thread Dong-hee Na


Change by Dong-hee Na :


--
nosy: +corona10




[issue45340] Lazily create dictionaries for plain Python objects

2021-10-01 Thread Mark Shannon


New submission from Mark Shannon :

A "normal" Python object is conceptually just a pair of pointers: one to the 
class and one to the dictionary.

With shared keys, the dictionary is redundant as it is no more than a pair of 
pointers, one to the keys and one to the values.

By adding a pointer to the values to the object, or embedding the values in the 
object, and fetching the keys via the class, we can avoid creating a dictionary 
for many objects.
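
The idea can be modelled in pure Python as a toy sketch. This is not CPython's 
actual C layout; SharedKeys, ToyObject and _MISSING are hypothetical names 
used only to show keys held once per class, values held per object, and a real 
dictionary built only on demand:

```python
_MISSING = object()  # sentinel: slot exists in the key table but is unset here


class SharedKeys:
    """Hypothetical per-class key table mapping attribute name to slot index."""

    def __init__(self):
        self.index = {}

    def slot_for(self, name):
        # Reuse the existing slot, or assign the next free one.
        return self.index.setdefault(name, len(self.index))


class ToyObject:
    """Toy object: shared keys, per-object values, lazily created dict."""

    def __init__(self, keys):
        self.keys = keys    # in the proposal, reached via the class
        self.values = []    # values stored with the object itself
        self.dict = None    # real dictionary, built only when requested

    def set(self, name, value):
        if self.dict is not None:
            self.dict[name] = value
            return
        slot = self.keys.slot_for(name)
        # Pad with the sentinel so the slot index is valid for this object.
        self.values.extend([_MISSING] * (slot + 1 - len(self.values)))
        self.values[slot] = value

    def get(self, name):
        if self.dict is not None:
            return self.dict[name]
        slot = self.keys.index.get(name)
        if slot is None or slot >= len(self.values) or self.values[slot] is _MISSING:
            raise AttributeError(name)
        return self.values[slot]

    def materialize(self):
        """Build the real dictionary the first time it is asked for."""
        if self.dict is None:
            self.dict = {
                name: self.values[slot]
                for name, slot in self.keys.index.items()
                if slot < len(self.values) and self.values[slot] is not _MISSING
            }
            self.values = None
        return self.dict


keys = SharedKeys()
a = ToyObject(keys)
b = ToyObject(keys)
a.set("x", 1)
a.set("y", 2)
b.set("y", 5)
print(a.get("x"), b.get("y"))
print(b.materialize())
```

In this model, attribute reads and writes never build a dictionary; only the 
explicit materialize() call does, and only for the object it is called on.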

See https://github.com/faster-cpython/ideas/issues/72 for more details.

--
assignee: Mark.Shannon
components: Interpreter Core
messages: 403010
nosy: Mark.Shannon, methane
priority: normal
severity: normal
status: open
title: Lazily create dictionaries for plain Python objects
type: enhancement
versions: Python 3.11
