> I too have sometimes proposed what I think of as "minor quality-of-life"
> enhancements, and had them shot down. It stings a bit, and can be
> frustrating, but remember it's not personal.
>

I don't mind the shooting down, as long as the arguments make sense :D.

It seems like we're both in agreement that the cost of implementing &
maintaining the change is non-zero.
I'm asserting that the end benefit of this change is also non-zero, and in
my opinion higher than the cost. But I also acknowledge that the benefit
may not be enough to overcome the inertia behind getting a change made.

The reason I'm persevering is to try to weed out the immaterial or
incorrect reasons for not making this change, so that hopefully we're left
with a good understanding of the pros and cons.


>
> The difficulty is that our QOL enhancement is someone else's bloat.
> Every new feature is something that has to be not just written once, but
> maintained, documented, tested and learned. Every new feature steepens
> the learning curve for the language; every new feature increases the
> size of the language, increases the time it takes to build, increases
> the time it takes for the tests to run.
>

Yeah, I can see that more code = more code overhead, and that's got to be
justified.
I don't believe that this feature would steepen the language's learning
curve, however; I think it would actually flatten it slightly (explained
more below).


>
> This one might only be one new method on three classes, but it all adds
> up, and we can't add *everything*.


> (I recently started writing what was intended to be a fairly small
> class, and before I knew it I was up to six helper classes, nearly 200
> methods, and approaching 1500 LOC, for what was conceptually intended to
> be a *lightweight* object. I've put this aside to think about it for a
> while, to decide whether to start again from scratch with a smaller API,
> or just remove the word "lightweight" from the description :-)
>

Absolute method count is seldom a standalone indicator of a dead end.
Classes with many methods (especially when they're accompanied by lots of
helpers) are often a side-effect of some abstraction failure. Usually when
I'm consulting on fixing projects with these characteristics, it's a case
of the developers not choosing their abstractions correctly, or letting
them leak. It sounds like you let that one get away from you; chalk it up
to a learning experience.

The "It walks like a zoo, squawks/lows/grunts/chitters like a zoo" problem
is very real. This is more of an "It used to be a duck. Now it walks like
a duck, but doesn't sound like a duck because it's a coot" problem.

>
> So each new feature has to carry its own weight. Even if the weight in
> effort to write, effort to learn, code, tests and documentation is
> small, the benefit gained must be greater or it will likely be rejected.
>
> "Nice to have" is unlikely to be enough, unless you happen to be one of
> the most senior core devs scratching your own itch, and sometimes not
> even then.
>
>
> > >>> import numpy as np
> > >>> mapping_table = np.array(BIG_LOOKUP_DICT.items())
> > [[1, 99],
> >  [2, 23],
> >  ...
> > ]
>
> That worked in Python 2 by making a copy of the dict items into a list.
> It will equally work in Python 3 by making a copy of the items into a
> list.
>
> And I expect that even if dict.items() was indexable, numpy would
> still have to copy the items. I don't know how numpy works in detail,
> but I doubt that it will be able to use a view of a hash table internals
> as a fast array without copying.
>
> Bottom line here is that adding indexing to dict views won't save you
> either time or memory or avoid making a copy in this example. All it
> will save you is writing an explicit call to `list`. And we know what
> the Zen says about being explicit.
>

What making the dict_* view types Sequences will do is make this code (as
written) behave:
1. like it used to, and
2. like most people seem to expect it to.

Currently numpy does something with that code that I consider unexpected
(I'm sure, given your previous responses, you'd disagree, but from
canvassing Python devs, I feel that many people share my opinion here).
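
To illustrate the surprise (a sketch; exact reprs vary by numpy version,
and the dict contents here are arbitrary): passing the view directly gives
a 0-dimensional object array wrapping the whole view, while the explicit
list() copy gives the two-column array most people expect.

```python
import numpy as np

d = {1: 99, 2: 23}

# numpy can't treat dict_items as a sequence, so it wraps the
# entire view in a single 0-dimensional object array.
a = np.array(d.items())
print(a.shape)  # ()

# The explicit copy produces the expected 2x2 array.
b = np.array(list(d.items()))
print(b.shape)  # (2, 2)
```

If the view were a Sequence, the first spelling could produce the same
result as the second.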


>
>
> > >>> import sqlite3
> > >>> conn = sqlite3.connect(":memory:")
> > >>> params = {'a': 1, 'b': 2}
> > >>> placeholders = ', '.join(f':{p}' for p in params)
> > >>> statement = f"select {placeholders}"
> > >>> print(f"Running: {statement}")
> > Running: select :a, :b
> > >>> cur=conn.execute(statement, params.values())
> > >>> cur.fetchall()
> > [(1, 2)]
>
> Why are you passing a view to a values when you could pass the dict
> itself? Is there some reason you don't do this?
>
>     # statement = "select :a, :b"
>     py> cur=conn.execute(statement, params)
>     py> cur.fetchall()
>     [(1, 2)]
>
> I'm not an expert on sqlite, so I might be missing something here, but I
> would have expected that this is the prefered solution. It matches the
> example in the docs, which uses a dict.
>
>
You're right; this was a version of code that I'd written before for a
different database driver (which didn't support named parameters), and
sqlite3 does support them, so that's my mistake. As mentioned elsewhere,
producing bullet-proof use-cases on demand can be tough.



> > # This currently works, but is deprecated in 3.9
> > >>> dict(random.sample({'a': 1, 'b': 2}.items(), 2))
> > {'b': 2, 'a': 1}
>
> I suspect that even if dict items were indexable, Raymond Hettinger
> would not be happy with random.sample on dict views.
>

I don't know why. I can understand deprecating sets here, as they're
unordered, so the results are not consistent even when seed() has been
called. But I don't see why Raymond would object to allowing sampling of
an ordered container, one from which the results will be reproducible.
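
Today the explicit-list spelling is what's required; under the proposal,
passing d.items() directly could behave identically. A sketch (the seed
and dict contents are arbitrary):

```python
import random

d = {'a': 1, 'b': 2, 'c': 3}

random.seed(42)
# The spelling that works today. With a Sequence view,
# random.sample(d.items(), 2) could give the same reproducible result.
pairs = random.sample(list(d.items()), 2)
print(pairs)
```

Because insertion order is preserved, the same seed yields the same sample
every run, which is exactly the reproducibility argument above.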


>
> > >>>  def min_max_keys(d):
> > >>>      min_key, min_val = d.items()[0]
> > >>>      max_key, max_val = min_key, min_val
> > >>>      for key, value in d.items():
>
> Since there's no random access to the items required, there's not really
> any need for indexing. You only need the first item, then iteration. So
> the natural way to write that is with iter() and next().
>

Yeah, it's possible to write this in any number of ways.

I canvassed some opinions from Python developers on how to do this sort of
thing, and 4 out of the 5 different responses I got wouldn't currently
work, because their suggested implementations relied on `.keys()` being
indexable, or on the views being Sequences.
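
For reference, the iter()/next() spelling suggested above looks something
like this (a sketch completing the truncated function; `min_max_items` is
just an illustrative name):

```python
def min_max_items(d):
    # Grab the first item without indexing, then iterate over the rest.
    it = iter(d.items())
    min_key, min_val = max_key, max_val = next(it)
    for key, value in it:
        if value < min_val:
            min_key, min_val = key, value
        if value > max_val:
            max_key, max_val = key, value
    return (min_key, min_val), (max_key, max_val)

print(min_max_items({'a': 3, 'b': 1, 'c': 5}))  # (('b', 1), ('c', 5))
```

It works, but `d.items()[0]` is the spelling most of those developers
reached for first.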



>
> I suspect that the difference in perspective here is that (perhaps?) you
> still think of concrete sequences and indexing as fundamental, while
> Python 3 has moved in the direction of making the iterator protocol and
> iterators as fundamental.
>

That is exactly the proposal: I want to make these things Sequences.

These things (the results of dict.keys(), for example) used to look and
act like nails. Then suddenly they looked and acted like screws (for good
reasons), but let's say screws with smooth heads, since many people still
think they're nails. Now there's a simple way to make them act like nails
again, so "but they're screws, so they can't be nails" doesn't hold water
as a counter-argument.


>
> You have a hammer (indexing), so you want views to be nails so you can
> hammer them. But views are screws, and need a screwdriver (iter and
> next).
>
>
I have a proposal: make these things indexable, so people can hammer them
in if they desire.



> ...not that we can jump to the 350th key without
> stepping through the previous 349 keys.
>

The existing dictionary memory layout doesn't support direct indexing
(jumping to the nth entry without stepping), so that capability is not
being proposed as a requirement here.


>
> Dicts have gone through a number of major redesigns and many careful
> tweaks over the years to get the best possible performance. The last
> major change was to add *order-preserving* behaviour, not indexing. The
> fact that they can be indexed in reasonable time is not part of the
> design, just an accident of implementation, and being an accident, it
> could change in the future.
>

To throw the question back: what use case are you considering here? Why
would dictionary iteration be made slower in the future?


> This feature would require upgrading that accident of implementation to
> a guarantee. If the Python world were awash with dozens of compelling,
> strong use-cases for indexing dicts, then we would surely be willing to
> make that guarantee. But the most compelling use-case we've seen so far
> is awfully weak indeed: choose a random item from a dict.
>

> So the cost-benefit calculation goes (in my opinion) something like
> this.
>
> 1. Risk of eliminating useful performance enhancements in the
>    future: small.
>

No use cases have been offered for how or why this would happen.


>
> 2. Benefit gained: even smaller.
>

Some admittedly weak use cases for why this would help.

>
>
> That's not FUD. It's just a simple cost-benefit calculation. You can
> counter it by finding good use-cases that are currently difficult and
> annoying to solve. Using an explicit call to list is neither difficult
> nor annoying :-)
>
>
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/TSWIG7PAQ25727XTVB5ZDICHPDZ2EXRI/
Code of Conduct: http://python.org/psf/codeofconduct/
