Delaney, Timothy C (Timothy) <[EMAIL PROTECTED]> wrote: > Steven Bethard wrote: > > > def filterdups(iterable): > > seen = set() > > for item in iterable: > > if item not in seen: > > seen.add(item) > > yield item > > > > Adding this to, say, itertools would cover all my use cases. And as > > long as you don't have too many duplicates, filterdups as above should > > keep memory consumption down better. > > Thinking about this further - memory usage would be almost identical. By > the time you completed the iterable, you would have built up exactly the > same set internally - although probably not as memory efficient since it > would be being built piecemeal. OTOH, an ordered set has a bit of extra > memory for maintaining the order, so it's going to be pretty close.
Definitely true that if you iterate through your entire iterable, it doesn't gain you anything over an OrderedSet. OTOH, if you only end up looking at the first N elements of the iterable, then this would be more efficient than putting the entire iterable into an OrderedSet and taking the first N from there. Of course you can put only the first elements into the OrderedSet, but note that you can't just put in the first N; you have to add elements of the iterable into the OrderedSet until it has len(obj) == N. Not that this should be more than a few lines of code, but it's code that you wouldn't have to write with fitlerdups. That said, you're right of course that filterdups is certainly not a big win over OrderedSet. I was really just trying to point out that if we were just trying to provide a solution to the filtering-duplicates-while-maintaining-order problem that OrderedSet wasn't the only path to that goal. I think since then there have been a number of other reasonable cases suggested for wanting an OrderedSet, e.g.: * A DB result set that is indexed by a key but also maintains row order [Eli Stevens, Jeff Bauer] * A config-file data structure that indexes by option names but maintains the order of elements read from a config file [John Williams] * Drop-down field selectors indexed by both name and sequence position [Jeff Bauer] So I'm now probably +0.5 on adding an OrderedSet to collections. Not that my opinion is particularly influential, of course. ;-) Steve -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com