On Thu, Jun 25, 2020 at 05:27:16PM +0300, Ben Avrahami wrote: > Hey all, > Often I've found this kind of code: > > seen = set() > for i in iterable: > if i in seen: > ... # do something in case of duplicates > else: > seen.add(i) > ... # do something in case of first visit > > This kind of code appears whenever one needs to check for duplicates in > case of a user-submitted iterable, or when we loop over a recursive > iteration that may involve cycles (graph search or the like). This code > could be improved if one could ensure an item is in the set, and get > whether it was there before in one operation.
I disagree that it would be improved. The existing idiom is simple and flexible and perfectly understandable even if you don't known the gory details about how sets work. Most importantly, it matches the way people think about the task: # Task: look for duplicates if element in seen: # it's a duplicate ... else: # never seen before, so remember it ... This idiom works with sets, it works with lists, it works with trees, it works with anything that supports membership testing and inserting into a collection. It is the natural way to think about the task. Whereas this: # Check for duplicates. add element to the collection was it already there before you just added it? is a weird way to think about the problem. already_there = seen.add(element) if already_there: # handle the duplicate case Who thinks like that? *wink* Now, I agree that sometimes for the sake of efficiency, we need to do things in a weird way. But membership testing in sets is cheap, it's not a linear search of a huge list. So the efficiency argument here is weak. If you can demonstrate a non-contrived case where the efficiency argument was strong, then I might be persuaded that a new method (not `add`) could be justified. But either way, you also have to decide whether the `add` (or the new method) should *unconditionally* insert the element, or only do so if it wasn't present. This makes a big difference: seen = {2} already_there = seen.add(2.0) At this point, is `seen` the set {2} or {2.0}? Justify why one answer is the best answer. -- Steven _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/M4BDA4AR555XY666ZNU5A5O4TNYDZLGR/ Code of Conduct: http://python.org/psf/codeofconduct/