On Fri, Jul 31, 2020 at 7:34 AM Guido van Rossum <gu...@python.org> wrote:

> So maybe we need to add dict.ordered() which returns a view on the items
> that is a Sequence rather than a set? Or ordereditems(), orderedkeys() and
> orderedvalues()?
>

I'm still confused as to when "ordered" became synonymous with "Sequence"
-- so wouldn't we want to call these dict.as_sequence() or something like
that?

And is there a reason that the regular dict views couldn't be both a Set
and a Sequence? Looking at the ABCs, I don't see a conflict -- __getitem__,
index() and count() would need to be added, and Sets don't have any of
those. (And count() could be optimized to always return 0 or 1 for
dict.keys() ;-) )
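
To make the "no conflict" claim concrete, here is a rough sketch of a
wrapper that inherits from both ABCs. KeysSetSequence is a made-up name,
not anything in the stdlib, and the O(n) indexing is just the simplest
thing that works, not a proposal for how dict would do it:

```python
from collections.abc import Sequence, Set
from itertools import islice


class KeysSetSequence(Set, Sequence):
    """Hypothetical dict-keys wrapper that is both a Set and a Sequence."""

    def __init__(self, mapping):
        self._mapping = mapping

    @classmethod
    def _from_iterable(cls, iterable):
        # Set operators (&, |, -) build their result through this hook;
        # returning a plain set keeps the sketch short.
        return set(iterable)

    def __len__(self):
        return len(self._mapping)

    def __iter__(self):
        return iter(self._mapping)

    def __contains__(self, key):
        return key in self._mapping

    def __getitem__(self, index):
        if isinstance(index, slice):
            return list(self._mapping)[index]
        size = len(self._mapping)
        if index < 0:
            index += size
        if not 0 <= index < size:
            raise IndexError("dict keys index out of range")
        # O(n) walk over the dict's insertion order.
        return next(islice(self._mapping, index, None))


keys = KeysSetSequence({"a": 1, "b": 2, "c": 3})
first = keys[0]               # Sequence-style indexing
tail = keys[1:]               # slicing
common = keys & {"a", "z"}    # Set-style intersection
position = keys.index("b")    # mixin method inherited from Sequence
occurrences = keys.count("c")
```

The Set mixins (&, |, <=, ...) and the Sequence mixins (index, count,
__reversed__) have no names in common, so both sets of methods coexist on
the one class.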

But anyway, naming aside, I'm still wondering whether we necessarily want
the entire Sequence protocol. For the use cases at hand, isn't indexing and
slicing enough?

Which brings us to the philosophy of duck typing. I wrote an earlier post
about that -- so here are some follow-up thoughts. I suggested that I like
the "if I only need it to quack, I don't care if it's a duck" approach -- I
try to use the quack() method, and I'm happy if it works, and raise an
Exception (or let whatever Exception gets raised bubble up) if it
doesn't.

Guido pointed out that having a quack() method isn't enough -- it also
needs to actually behave as you expect -- which is the nice thing about
ABCs -- if you know something is a Sequence, you don't just know that you
can index it, you know that indexing it will do what you expect.

Which brings us back to the random.choice() function. It's really simple,
and uses exactly the approach I outlined above.

    def choice(self, seq):
        """Choose a random element from a non-empty sequence."""
        try:
            i = self._randbelow(len(seq))
        except ValueError:
            raise IndexError('Cannot choose from an empty sequence') from None
        return seq[i]

It checks the length of the object, picks a random index within that
length, and then tries to use that index to get a random item. So anything
with a __len__ and a __getitem__ that accepts integers will work.
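
A quick illustration of that, using a made-up class (Squares is purely
for demonstration) that is not a registered Sequence and has nothing but
__len__ and __getitem__:

```python
import random
from collections.abc import Sequence


class Squares:
    """Illustrative only: supports len() and integer indexing, nothing else."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def __getitem__(self, i):
        if not 0 <= i < self._n:
            raise IndexError(i)
        return i * i


squares = Squares(5)
is_sequence = isinstance(squares, Sequence)  # False: never registered
picked = random.choice(squares)              # works anyway: one of 0, 1, 4, 9, 16
```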

And this has worked "fine" for decades. Should it be checking that seq is
actually a sequence? I don't think so -- I like that I can pass in any
object that's indexable by an integer.

But there is a potential problem here -- all it does is try to pass an
integer to __getitem__. So all Sequences should work. But Mappings also
have a __getitem__, but with slightly different semantics -- a Sequence
should accept an integer (or object with an __index__) in the range of its
size, but a Mapping can accept any valid key. So for the most part, passing
a Mapping to random.choice() fails as it should, with a KeyError. But if
you happen to have a key that is an integer, it might succeed, but it would
not be doing "the right thing" (unless the Mapping happened to be
constructed exactly the right way -- but then it should probably just be a
Sequence).
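
Both outcomes are easy to reproduce. The dicts below are made up for the
demonstration:

```python
import random

# String keys: whatever integer index gets picked is not a key.
by_name = {"a": 1, "b": 2, "c": 3}
try:
    random.choice(by_name)
    outcome = "returned a value"
except KeyError:
    outcome = "KeyError"

# Keys that happen to be exactly 0..len-1: the call "succeeds",
# but it hands back a *value*, treating the dict as if it were a list.
by_int = {0: "x", 1: "y", 2: "z"}
value = random.choice(by_int)
```

With keys like {1, 2, 3} instead, the call would succeed or raise
KeyError depending on which index happens to be drawn.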

So: do we need a solution to this? I don't think so, it's simply the nature
of dynamic typing as far as I'm concerned, but if we wanted it to be more
robust, we could require (maybe only with a static type declaration) that
the object passed in is a Sequence.

But I think that would be a shame -- this function doesn't need a full
Sequence, it only needs a Sized and __getitem__.
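
For the static-typing side of this, one way to spell "Sized plus
__getitem__" is a typing.Protocol. This is only a sketch: SizedGetitem
and pick() are names invented here, and pick() is a stand-in, not the
real random.choice:

```python
import random
from typing import Protocol, TypeVar

T = TypeVar("T")
T_co = TypeVar("T_co", covariant=True)


class SizedGetitem(Protocol[T_co]):
    """Hypothetical protocol: just len() and integer indexing."""

    def __len__(self) -> int: ...

    def __getitem__(self, index: int) -> T_co: ...


def pick(seq: SizedGetitem[T]) -> T:
    """Stand-in for random.choice(), annotated with the narrow protocol."""
    return seq[random.randrange(len(seq))]


letter = pick("abc")     # str qualifies
number = pick([7])       # so does list
```

A static checker only compares method signatures here, so it says
nothing about whether indexing behaves the way a Sequence's would.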

In fact, the ABCs are designed to accommodate much of this -- for example,
the Sized ABC only requires one feature: __len__. And Container only
__contains__. As far as I know there are no built-in (or commonly used
third-party) objects that are ONLY Sized, or ONLY Container. In fact, at
least in collections.abc, every other ABC that has __contains__ also has
__len__. And I can't think of anything that could support "in" that didn't
have a size -- which could be a failure of imagination on my part. But you
could type check for Container if all you wanted to do was know that you
could use it with "in".
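
A small sketch of those single-method checks, using the collections.abc
names (Sized, and Container for the "contains" one). Evens and
ThreeThings are invented for the example; Evens is one contrived case of
something that supports "in" without having a size:

```python
from collections.abc import Container, Sized


class Evens:
    """Illustrative: the (unbounded) set of even integers."""

    def __contains__(self, item):
        return item % 2 == 0


class ThreeThings:
    """Illustrative: has a length and nothing else."""

    def __len__(self):
        return 3


evens = Evens()
evens_is_container = isinstance(evens, Container)    # True
evens_is_sized = isinstance(evens, Sized)            # False
four_is_even = 4 in evens                            # True

things = ThreeThings()
things_is_sized = isinstance(things, Sized)          # True
things_is_container = isinstance(things, Container)  # False
```

Neither class inherits from or registers with an ABC; these one-method
ABCs recognize any class that defines the method.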

So there are ABCs there simply to support a single method. Which means that
we could solve the "problem" of random.choice with a "Getitemable" ABC.
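
Such an ABC could be sketched along the lines of the existing one-method
ABCs. Getitemable is a hypothetical name and this is not an existing
stdlib class:

```python
from abc import ABC, abstractmethod


class Getitemable(ABC):
    """Hypothetical one-method ABC: anything that defines __getitem__."""

    __slots__ = ()

    @abstractmethod
    def __getitem__(self, key):
        raise NotImplementedError

    @classmethod
    def __subclasshook__(cls, C):
        # Structural check: accept any class that defines __getitem__
        # somewhere in its MRO.
        if cls is Getitemable:
            if any("__getitem__" in B.__dict__ for B in C.__mro__):
                return True
        return NotImplemented


list_ok = isinstance([1, 2, 3], Getitemable)   # True
dict_ok = isinstance({"a": 1}, Getitemable)    # True as well
set_ok = isinstance({1, 2, 3}, Getitemable)    # False: sets can't be indexed
```

Note that a check like this accepts list and dict alike, since all it
looks at is whether the method exists.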

Ahh -- but here's the rub -- while the ABCs only require certain methods --
in fact, it's implied that they have particular behavior as well. And this
is the problem at hand. Both Sequences and Mappings have a __getitem__, but
they have somewhat different meanings, and that meaning is embedded in the
ABC itself, rather than the method: Sequences will take an integer, and
raise an IndexError if it's out of range, and Mappings take any hashable, and
will raise a KeyError if it's not there.

So maybe what is needed is an Indexable ABC that implies the Sequence-like
indexing behavior.

Then if we added indexing to dict views, they would be an Indexable, but
not a Sequence.
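
Here is a rough sketch of how that could look. Both Indexable and
IndexableKeys are hypothetical names, and since the behavior can't be
detected by looking at method names, membership is by explicit
inheritance or registration only:

```python
import random
from abc import ABC, abstractmethod
from collections.abc import Sequence
from itertools import islice


class Indexable(ABC):
    """Hypothetical ABC: len() plus integer indexing that raises
    IndexError when out of range. No __subclasshook__ on purpose."""

    @abstractmethod
    def __len__(self): ...

    @abstractmethod
    def __getitem__(self, index): ...


# Every Sequence already promises this behavior.
Indexable.register(Sequence)


class IndexableKeys(Indexable):
    """Hypothetical dict-keys view with indexing, but not a full Sequence."""

    def __init__(self, mapping):
        self._mapping = mapping

    def __len__(self):
        return len(self._mapping)

    def __getitem__(self, index):
        size = len(self._mapping)
        if index < 0:
            index += size
        if not 0 <= index < size:
            raise IndexError("dict view index out of range")
        # O(n) walk over the dict's insertion order.
        return next(islice(self._mapping, index, None))


keys = IndexableKeys({"a": 1, "b": 2})
view_is_indexable = isinstance(keys, Indexable)   # True
view_is_sequence = isinstance(keys, Sequence)     # False
list_is_indexable = isinstance([1], Indexable)    # True, via Sequence
dict_is_indexable = isinstance({}, Indexable)     # False
chosen = random.choice(keys)                      # "a" or "b"
```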

-CHB

> On Fri, Jul 31, 2020 at 05:29 Ricky Teachey <ri...@teachey.org> wrote:
>
>> On Fri, Jul 31, 2020, 2:48 AM Wes Turner <wes.tur...@gmail.com> wrote:
>>
>>> # Dicts and DataFrames
>>> - Src:
>>> https://github.com/westurner/pythondictsanddataframes/blob/master/dicts_and_dataframes.ipynb
>>> - Binder:
>>> https://mybinder.org/v2/gh/westurner/pythondictsanddataframes/master?filepath=dicts_and_dataframes.ipynb
>>>   - (interactive Jupyter Notebook hosted by https://mybinder.org/ )
>>>
>>
>> The punchline of Wes Turner's notebook (very well put together, thank
>> you!) seems to partly be that if you find yourself wanting to work with the
>> position of items in a dict, you might want to consider using a
>> pandas.Series (with its .iloc method).
>>
>> A difficulty that immediately came to mind with this advice is type
>> hinting support. I was just googling yesterday for "how to type hint using
>> pandas" and the only thing I found is to use pd.Series and pd.DataFrame
>> directly.
>>
>> But those don't support type hinting comparable to:
>>
>> Dict[str, float]
>>
>> Or:
>>
>> class Vector(TypedDict):
>>     i: float
>>     j: float
>>
>> This is a big downside of the advice "just use pandas". Although I love
>> using pandas and use it all the time.
>> _______________________________________________
>> Python-ideas mailing list -- python-ideas@python.org
>> To unsubscribe send an email to python-ideas-le...@python.org
>> https://mail.python.org/mailman3/lists/python-ideas.python.org/
>> Message archived at
>> https://mail.python.org/archives/list/python-ideas@python.org/message/C7HJFKB67U74SULO6OUTLWST2MHZERCH/
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
> --
> --Guido (mobile)
>


-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/AAZALPS2O5WWEHQH23VSI6C7CWSYJ2PM/
Code of Conduct: http://python.org/psf/codeofconduct/
