Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
Anthony Baxter wrote: On Thursday 10 March 2005 17:29, Raymond Hettinger wrote: Or the implementation can have a switch to choose between keep-first logic or replace logic. The latter seems a bit odd to me. The key position would be determined by the first encountered while the value would be determined by the last encountered. Starting with [(10, v1), (20, v2), (10.0, v3)], the ordered dictionary's items would look like [(10, v3), (20, v2)]. Or, alternately, we keep the last occurence, and move it to the new position. There's a wide number of different use cases, each with a slightly different final result, and for this reason alone I'm -0 on it for the library. Anthony But surely such use cases could be more easily handled by subclassing from collections.OrderedDict and tweaking the semantics than by having to implement an ordered mapping from scratch. Would the default semantics below really be that suprising? "An ordered dictionary remembers the order in which keys are first seen and, when iterated over, returns entries based on that order. This applies to direct iteration, iteration over values and (key, value) pairs, to the list-producing methods (i.e. keys(), values() and items()) and to any other operations that involve implicit iteration (e.g. converting to a string representation). Overwriting an entry replaces its value, but does not affect its position in the key order. Removing an entry (using 'del') _does_ remove it from the key order. Accordingly, if the entry is later recreated, it will then occur last in the key order. This behaviour is analagous to that of a list constructed using only list.append() to add items (indeed, the key order can be thought of as a list constructed in this fashion, with keys appended to the list when they are first encountered). An ordered dictionary provides a sort() method. The sort operation is applied to the key ordering and affects future iteration over the dictionary. Again, this is analagous to sorting a list." For instance, to convert a standard dictionary to an ordered dictionary using a specific key function: x = collections.OrderedDict(sorted(d.itervalues(), key=keyfunc)) Cheers, Nick. -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia --- http://boredomandlaziness.skystorm.net ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
"Raymond Hettinger" <[EMAIL PROTECTED]> wrote: > > [BJörn Lindqvist] > > I would LOVE for **kwargs to be an ordered dict. It would allow me to > > write code like this: > > > > .class MyTuple: > > .def __init__(self, **kwargs): > > .self.__dict__ = ordereddict(kwargs) > > This doesn't work. The kwargs are already turned into a regular > dictionary before ordereddict sees it. From what I understand, he was saying that it would be nice if kwargs were an ordered dict /instead of/ a standard dict. Whether or not he realizes it will not happen due to the 2x memory overhead, 2x speed hit, etc., every time kwargs are used, is another matter. Alternatively, BJorn could use a list of tuples and *args to preserve order, but that is an off-list discussion for another day. - Josiah ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
[BJörn Lindqvist] > I would LOVE for **kwargs to be an ordered dict. It would allow me to > write code like this: > > .class MyTuple: > .def __init__(self, **kwargs): > .self.__dict__ = ordereddict(kwargs) This doesn't work. The kwargs are already turned into a regular dictionary before ordereddict sees it. Raymond Hettinger ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
On Thursday 10 March 2005 17:29, Raymond Hettinger wrote: > Or the implementation can have a switch to choose between keep-first > logic or replace logic. > > The latter seems a bit odd to me. The key position would be determined > by the first encountered while the value would be determined by the last > encountered. Starting with [(10, v1), (20, v2), (10.0, v3)], the > ordered dictionary's items would look like [(10, v3), (20, v2)]. Or, alternately, we keep the last occurence, and move it to the new position. There's a wide number of different use cases, each with a slightly different final result, and for this reason alone I'm -0 on it for the library. Anthony -- Anthony Baxter <[EMAIL PROTECTED]> It's never too late to have a happy childhood. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
I would LOVE for **kwargs to be an ordered dict. It would allow me to write code like this: .class MyTuple: .def __init__(self, **kwargs): .self.__dict__ = ordereddict(kwargs) . .def __iter__(self): .for k, v in self.__dict__.items(): .yield v . .t = MyTuple(r = 99, g = 12, b = 4) .r, g, b = t .print r, g, b I know it goes beyond the original intention of the proposal, but I figure I'd mention it anyway because the unorder of **kwargs has been bugging me alot. -- mvh Björn ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
On Thu, 2005-03-10 at 01:29, Raymond Hettinger wrote: > Or the implementation can have a switch to choose between keep-first > logic or replace logic. This is what I meant by my previous follow up: while the concept of "an ordered dictionary" is nice and seemingly generic enough, in practice I suspect that exact semantics and other design factors will either tend to push the stdlib implementation into ever more complexity, or won't prevent people from continuing to roll their own because the stdlib version "isn't quite right". -Barry signature.asc Description: This is a digitally signed message part ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
On Wed, 2005-03-09 at 19:39, Tommy Burnette wrote: > I'd say I'm +0. fwiw- I've been using a locally-rolled OrderedDict > implementation for the last 5-6 years in which insertion order is the > only order respected. I use it all over the place (in a code base of > ~60k lines of python code). > > so there's another use case for you. bust as you say, easy to do > yourself... I'm -0 on adding it to the stdlib, but mostly because I don't like the name, and I suspect it's going to be one of those nuggets lurking in the standard library that few people will use, tending either to overlook it or just roll their own anyway because the standard one doesn't have quite the right semantics. FWIW, email.Message.Message /exposes/ an ordered dictionary-like interface even though it's implemented as a simple list. It was considered at the time that the number of headers in an email message wouldn't be so large that anything else would be worth the complexity. I think that still holds, for the original uses cases at least. -Barry signature.asc Description: This is a digitally signed message part ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
Delaney, Timothy C (Timothy) wrote: OTOH, "ordered set" and "ordered dict" implies different things to different people - usually "sorted" rather than "the order things were put in". Perhaps "temporally-ordered" ;) OTGH*, I would expect an OrderedDict / OrderedSet to have 'add to the end' semantics, but also provide a 'sort()' method so that the ordering could be changed at a later date. IOW, by default the ordering is temporal. Sorting the ordered dict/set changes the iteration order for the current contents. Further additions are still added in temporal order until such time as the dict/set is sorted again. The parallels are to using list.append() to build a list, and list.sort() to order the current contents (in fact, a simplistic approach could use that exact technique to remember the order of keys, at the cost of doubling key storage requirements). Cheers, Nick. *OTGH: On the gripping hand :) -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia --- http://boredomandlaziness.skystorm.net ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
"Delaney, Timothy C (Timothy)" <[EMAIL PROTECTED]> writes: > Set: Items are iterated over in the order that they are added. Adding an > item that compares equal to one that is already in the set does not > replace the item already in the set, and does not change the iteration > order. Removing an item, then re-adding it moves the item to the end of > the iteration order. Well, this could be satisfied by an append_new operation on lists, right (thinking of Common Lisps #'cl:pushnew)? Complexity not that great, of course, but I've written code like: if a not in l: l.append(a) and not suffered that badly for it before now... > Dict: Keys are iterated over in the order that they are added. Setting a > value using a key that compares equal to one already in the dict > replaces the value, but not the key, and does not change the iteration > order. Removing a key (and value) then re-adding it moves the key to the > end of the iteration order. And these are what CL types call association lists, in effect. Cheers, mwh -- #ifndef P_tmpdir printf( "Go buy a better computer" ); exit( ETHESKYISFALLINGANDIWANTMYMAMA ); -- Dimitri Maziuk on writing secure code, asr ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
> > If I read the proposal correctly, order would be determined by when the > > key was first encountered. Presumably, that would mean that the related > > value would also be the first encountered (not overridden by later-added > > keys as dictated by your business requirements). > > Hm Guess this means we need a PEP! Or the implementation can have a switch to choose between keep-first logic or replace logic. The latter seems a bit odd to me. The key position would be determined by the first encountered while the value would be determined by the last encountered. Starting with [(10, v1), (20, v2), (10.0, v3)], the ordered dictionary's items would look like [(10, v3), (20, v2)]. Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
Delaney, Timothy C (Timothy) <[EMAIL PROTECTED]> wrote: > Steven Bethard wrote: > > > def filterdups(iterable): > > seen = set() > > for item in iterable: > > if item not in seen: > > seen.add(item) > > yield item > > > > Adding this to, say, itertools would cover all my use cases. And as > > long as you don't have too many duplicates, filterdups as above should > > keep memory consumption down better. > > Thinking about this further - memory usage would be almost identical. By > the time you completed the iterable, you would have built up exactly the > same set internally - although probably not as memory efficient since it > would be being built piecemeal. OTOH, an ordered set has a bit of extra > memory for maintaining the order, so it's going to be pretty close. Definitely true that if you iterate through your entire iterable, it doesn't gain you anything over an OrderedSet. OTOH, if you only end up looking at the first N elements of the iterable, then this would be more efficient than putting the entire iterable into an OrderedSet and taking the first N from there. Of course you can put only the first elements into the OrderedSet, but note that you can't just put in the first N; you have to add elements of the iterable into the OrderedSet until it has len(obj) == N. Not that this should be more than a few lines of code, but it's code that you wouldn't have to write with fitlerdups. That said, you're right of course that filterdups is certainly not a big win over OrderedSet. I was really just trying to point out that if we were just trying to provide a solution to the filtering-duplicates-while-maintaining-order problem that OrderedSet wasn't the only path to that goal. I think since then there have been a number of other reasonable cases suggested for wanting an OrderedSet, e.g.: * A DB result set that is indexed by a key but also maintains row order [Eli Stevens, Jeff Bauer] * A config-file data structure that indexes by option names but maintains the order of elements read from a config file [John Williams] * Drop-down field selectors indexed by both name and sequence position [Jeff Bauer] So I'm now probably +0.5 on adding an OrderedSet to collections. Not that my opinion is particularly influential, of course. ;-) Steve -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
Steven Bethard wrote: > def filterdups(iterable): > seen = set() > for item in iterable: > if item not in seen: > seen.add(item) > yield item > > Adding this to, say, itertools would cover all my use cases. And as > long as you don't have too many duplicates, filterdups as above should > keep memory consumption down better. Thinking about this further - memory usage would be almost identical. By the time you completed the iterable, you would have built up exactly the same set internally - although probably not as memory efficient since it would be being built piecemeal. OTOH, an ordered set has a bit of extra memory for maintaining the order, so it's going to be pretty close. The only thing this gains you (and it's significant) is the ability to work on any iterable lazily. Tim Delaney ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
Jeff Bauer wrote: > I'm not specifically lobbying for its inclusion in > stdlib, but I often find an ordered dict useful when I > want both ordered and random access, e.g. situations: > >- database table fields/attributes >- drop down field selectors Yep - these are also cases that are familiar to me - it's the type of thing you don't think about until you actually need it. > In both cases, I could get by with other techniques, but I > would use stdlib ordered dicts if they were available. > I also prefer the term "ordered dict" to LinkedHashXXX. You may notice the subject is LinkedHashXXX *equivalents* ;) There is no way I would ever propose adding them with those names. OTOH, "ordered set" and "ordered dict" implies different things to different people - usually "sorted" rather than "the order things were put in". Perhaps "temporally-ordered" ;) BTW, just to clarify the semantics: Set: Items are iterated over in the order that they are added. Adding an item that compares equal to one that is already in the set does not replace the item already in the set, and does not change the iteration order. Removing an item, then re-adding it moves the item to the end of the iteration order. Dict: Keys are iterated over in the order that they are added. Setting a value using a key that compares equal to one already in the dict replaces the value, but not the key, and does not change the iteration order. Removing a key (and value) then re-adding it moves the key to the end of the iteration order. Tim Delaney ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
Thomas Heller <[EMAIL PROTECTED]> wrote: > Steven Bethard <[EMAIL PROTECTED]> writes: > >> What use do you have for it other than filtering >> duplicates from a list while retaining order? > > If this were the only use case, you are right. I cannot > remember the use case I once had right now, and probably > the code has been rewritten anyway. But I assume use > cases exist, otherwise there weren't so many recipes > for the ordered dictionary. I'm not specifically lobbying for its inclusion in stdlib, but I often find an ordered dict useful when I want both ordered and random access, e.g. situations: - database table fields/attributes - drop down field selectors In both cases, I could get by with other techniques, but I would use stdlib ordered dicts if they were available. I also prefer the term "ordered dict" to LinkedHashXXX. Jeff Bauer Rubicon, Inc. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
On Wed, Mar 09, 2005, Raymond Hettinger wrote: > [Aahz] >> >> Gee, I just found out I could have used an OrderedDict today. (We're >> using a dict that we're now having to add an auxilliary list to to track >> when keys are added.) (This isn't particularly an argument in favor of >> adding OrderedDict to stdlib, but it's another use case. Each dict key >> contains a dict value; the subkeys from later-added keys are supposed to >> override earlier subkeys. The original implementation relied on subkeys >> being unique, but that doesn't work for our new business requirements.) > > If I read the proposal correctly, order would be determined by when the > key was first encountered. Presumably, that would mean that the related > value would also be the first encountered (not overridden by later-added > keys as dictated by your business requirements). Hm Guess this means we need a PEP! -- Aahz ([EMAIL PROTECTED]) <*> http://www.pythoncraft.com/ "The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death." --GvR ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
[Aahz] > Gee, I just found out I could have used an OrderedDict today. (We're > using a dict that we're now having to add an auxilliary list to to track > when keys are added.) (This isn't particularly an argument in favor of > adding OrderedDict to stdlib, but it's another use case. Each dict key > contains a dict value; the subkeys from later-added keys are supposed to > override earlier subkeys. The original implementation relied on subkeys > being unique, but that doesn't work for our new business requirements.) If I read the proposal correctly, order would be determined by when the key was first encountered. Presumably, that would mean that the related value would also be the first encountered (not overridden by later-added keys as dictated by your business requirements). Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
On Wed, Mar 09, 2005, Tommy Burnette wrote: > > I'd say I'm +0. fwiw- I've been using a locally-rolled OrderedDict > implementation for the last 5-6 years in which insertion order is the > only order respected. I use it all over the place (in a code base of > ~60k lines of python code). > > so there's another use case for you. bust as you say, easy to do > yourself... Gee, I just found out I could have used an OrderedDict today. (We're using a dict that we're now having to add an auxilliary list to to track when keys are added.) (This isn't particularly an argument in favor of adding OrderedDict to stdlib, but it's another use case. Each dict key contains a dict value; the subkeys from later-added keys are supposed to override earlier subkeys. The original implementation relied on subkeys being unique, but that doesn't work for our new business requirements.) -- Aahz ([EMAIL PROTECTED]) <*> http://www.pythoncraft.com/ "The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death." --GvR ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
I'd say I'm +0. fwiw- I've been using a locally-rolled OrderedDict implementation for the last 5-6 years in which insertion order is the only order respected. I use it all over the place (in a code base of ~60k lines of python code). so there's another use case for you. bust as you say, easy to do yourself... =?ISO-8859-1?Q?"Martin_v._L=F6wis"?= writes: | Thomas Heller wrote: | > I cannot understand why people are against adding it to stdlib (after | > the name, the implementation, and the exact place have been decided). | > It's certainly a useful data type, isn't it? | | It depends on the precise semantics. You often want a dictionary where | the keys come out in some order - but that is rarely the order in which | they were added to the dictionary. Most frequently, you want the keys | sorted, according to some criteria. If not that, I would assume that you | typically have the order of keys determined before even filling the | dictionary, in which case you can do | | for k in keys_in_preferred_order: | v = hashtable[k] | process(k,v) | | I remember having needed that once in the past 15 years (in Smalltalk | at the time), so I wrote an OrderedDictionary for Smalltalk/V (which | didn't have it). It took me an hour or so. | | I don't recall what precisely it was that I needed it for, and I cannot | think of any use case for the data type right now. | | So I'm -0 on adding the data type: I have a vague feeling it is needed, | but rarely, and I don't know precisely what for. | | Regards, | Martin | ___ | Python-Dev mailing list | Python-Dev@python.org | http://mail.python.org/mailman/listinfo/python-dev | Unsubscribe: http://mail.python.org/mailman/options/python-dev/tommy%40ilm.com ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
Steven Bethard wrote: Thomas Heller <[EMAIL PROTECTED]> wrote: [About an ordered dictionary] Well, that was basically the question I posed. So far I've seen only one use for it, and that one is better served by adding a function to itertools. What use do you have for it other than filtering duplicates from a list while retaining order? The primary use case I have deals with DB result sets*. The ordering of the rows returned from a query is important, so keeping the iteration order is nice. Most of the tables I deal with have keys of some kind, and being able to pull out a result row by key is also nice. Granted, I rarely use /both/ at the same time, but it is nice to not have to specify how the result set will be used when I retrieve it. To me, this seems to be the same concept as the config file parsing previously mentioned. I don't feel qualified to have an opinion** about inclusion in the stdlib, much less vote. relurkin'ly yrs, Eli [*] - In my case, it's actually coded in Java, for work. There might be a reason that this problem isn't language-generic, but the 1.5 minutes I spent thinking about it were not illuminating. [**] - Yet I have one anyway. This kind of datatype seems one of those easy-to-get-half-right things that could benefit from a solid implementation. It also doesn't strike me as controversial in terms of API or internal structure. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
John Williams wrote: The other use case I have is for dealing with data where the iteration order doesn't matter to the program but it does matter to users. For instance, consider the ConfigParser's write method. Any ordering of values in the output is functionally equivalent, but the original data is likely to have come from a file that was arranged in some meaningful order, and it would be nice to preserve that order, especially if it can be done with no extra effort. That is a good example, IMO. But then, in the specific case, the dictionary for each section is created deeply inside ConfigParser, with the lines cursect = {'__name__': sectname} self._sections[sectname] = cursect So this uses a builtin dict, and there is no easy way to override it for the application. Of course, given your reasoning, ConfigParser *should* use an OrderedDictionary (probably both for cursect and for self._sections), in which case this would all be transparent to the application. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
Steven Bethard wrote: > Thomas Heller <[EMAIL PROTECTED]> wrote: > >> [About an ordered dictionary] > > > Well, that was basically the question I posed. So far I've seen only > one use for it, and that one is better served by adding a function to > itertools. What use do you have for it other than filtering > duplicates from a list while retaining order? > > Steve Using a LinkedHashMap generally cuts down in the amount of apparent randomness in a program. This is especially helpful when it comes time to debug a really complicated program by diffing log files, since it prevents slightly different maps from having wildly different iteration orders. Often using a plain HashMap can introduce enough randomness to make two otherwise similar log files nearly impossible to compare. The other use case I have is for dealing with data where the iteration order doesn't matter to the program but it does matter to users. For instance, consider the ConfigParser's write method. Any ordering of values in the output is functionally equivalent, but the original data is likely to have come from a file that was arranged in some meaningful order, and it would be nice to preserve that order, especially if it can be done with no extra effort. --jw ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
James Y Knight wrote: I use ordered dictionaries for testing. With an ordered dict I can string compare the output of my program to what is expected. Without an ordered dict, I'd have to re-parse the output and order it, which would require some complicated code that's just as likely to be wrong as the code I'm trying to test. I see. I would argue that you were better off if the test cases were sorted (according to some total, stable-across-releases, order), rather than being ordered. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
Thomas Heller wrote: I cannot understand why people are against adding it to stdlib (after the name, the implementation, and the exact place have been decided). It's certainly a useful data type, isn't it? It depends on the precise semantics. You often want a dictionary where the keys come out in some order - but that is rarely the order in which they were added to the dictionary. Most frequently, you want the keys sorted, according to some criteria. If not that, I would assume that you typically have the order of keys determined before even filling the dictionary, in which case you can do for k in keys_in_preferred_order: v = hashtable[k] process(k,v) I remember having needed that once in the past 15 years (in Smalltalk at the time), so I wrote an OrderedDictionary for Smalltalk/V (which didn't have it). It took me an hour or so. I don't recall what precisely it was that I needed it for, and I cannot think of any use case for the data type right now. So I'm -0 on adding the data type: I have a vague feeling it is needed, but rarely, and I don't know precisely what for. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
On Mar 9, 2005, at 4:19 PM, Thomas Heller wrote: If this were the only use case, you are right. I cannot remember the use case I once had right now, and probably the code has been rewritten anyway. But I assume use cases exist, otherwise there weren't so many recipes for the ordered dictionary. I use ordered dictionaries for testing. With an ordered dict I can string compare the output of my program to what is expected. Without an ordered dict, I'd have to re-parse the output and order it, which would require some complicated code that's just as likely to be wrong as the code I'm trying to test. James ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
Thomas Heller <[EMAIL PROTECTED]> wrote: > Steven Bethard <[EMAIL PROTECTED]> writes: > > > What use do you have for it other than filtering > > duplicates from a list while retaining order? > > If this were the only use case, you are right. I cannot remember the > use case I once had right now, and probably the code has been rewritten > anyway. But I assume use cases exist, otherwise there weren't so many > recipes for the ordered dictionary. Sorry, I didn't mean to come across as suggesting that there weren't other use cases for it. I'm only asking for people who know these other use cases to present them here. At the moment, we have only one use case, and for that use case, OrderedDict/OrderedSet isn't really the best solution. (That's all my code was intended to point out -- I don't really care if it makes it into itertools or not.) If people could present a few reasonable problems where OrderedDict/OrderedSet really do provide the best solutions, and then make the case that these use cases are reasonably frequent, I think there would be much better support for adding such types to the standard library. Steve -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
Steven Bethard <[EMAIL PROTECTED]> writes: > Thomas Heller <[EMAIL PROTECTED]> wrote: >> [About an ordered dictionary] > [snip] >> I cannot understand why people are against adding it to stdlib (after >> the name, the implementation, and the exact place have been decided). >> It's certainly a useful data type, isn't it? > > Well, that was basically the question I posed. So far I've seen only > one use for it, and that one is better served by adding a function to > itertools. Hm, removing duplicates from a list is an algorithm, not a data structure. And the code you posted (no offense intended) is, also imo, faster written by an experienced programmer than located in some module. OTOH, I see no problem adding it to itertools. > What use do you have for it other than filtering > duplicates from a list while retaining order? If this were the only use case, you are right. I cannot remember the use case I once had right now, and probably the code has been rewritten anyway. But I assume use cases exist, otherwise there weren't so many recipes for the ordered dictionary. Thomas ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
Thomas Heller <[EMAIL PROTECTED]> wrote: > [About an ordered dictionary] [snip] > I cannot understand why people are against adding it to stdlib (after > the name, the implementation, and the exact place have been decided). > It's certainly a useful data type, isn't it? Well, that was basically the question I posed. So far I've seen only one use for it, and that one is better served by adding a function to itertools. What use do you have for it other than filtering duplicates from a list while retaining order? Steve -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
[About an ordered dictionary] > Delaney, Timothy C (Timothy) wrote: >> Does anyone else think it would be worthwhile adding these to >> collections, or should I just make a cookbook entry? Greg Ward <[EMAIL PROTECTED]> writes: > +1 on a cookbook entry. +0 on adding to stdlib. "Martin v. Löwis" <[EMAIL PROTECTED]> writes: > -0 for the addition to the collections module, -1 on the specific > name. I cannot understand why people are against adding it to stdlib (after the name, the implementation, and the exact place have been decided). It's certainly a useful data type, isn't it? Thomas ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
Delaney, Timothy C (Timothy) wrote: Does anyone else think it would be worthwhile adding these to collections, or should I just make a cookbook entry? -0 for the addition to the collections module, -1 on the specific name. Java made a *big* mistake by putting "Hash" into the names of these things. From the outside, it is primarily a Dictionary; only when you look closer you see that this specific dictionary requires hashable keys (as opposed to, say, the C++ std::map, which requires sortable keys). So consequently, the data type should be called OrderedDictionary, and its cookbook recipe is http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/107747 Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
Greg Ward wrote: > I'll attach another approach to the same problem, an ordered > dictionary object. I believe the semantics of > adding/readding/deleting keys is the same as java.util.LinkedHashMap > -- certainly it seems the most sensible and easy-to-implement > semantics. That's essentially the same approach I used - I didn't get around to properly implementing a doubly-linked list between the keys - I just maintained a list with the correct order. If I'm going to write a Cookbook recipe though I should probably do it properly ... Personally, I think it would be quite nice if all python dictionaries had these semantics - it would be nice to be able to say that items will be returned in the order they're inserted - but I wouldn't want to incur the additional cost in such a fundamental data structure. One thing I did notice - dict.__repr__ and dict.__str__ don't produce strings in the iteration order of subclasses - they always use the arbitrary dict iteration order. Is this intentional? Tim Delaney ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
On 09 March 2005, Delaney, Timothy C (Timothy) said: > For those who don't know, LinkedHashSet and LinkedHashMap are simply > hashed sets and maps that iterate in the order that the keys were added > to the set/map. I almost invariably use them for the above scenario - > removing duplicates without changing order. > > Does anyone else think it would be worthwhile adding these to > collections, or should I just make a cookbook entry? +1 on a cookbook entry. +0 on adding to stdlib. I'll attach another approach to the same problem, an ordered dictionary object. I believe the semantics of adding/readding/deleting keys is the same as java.util.LinkedHashMap -- certainly it seems the most sensible and easy-to-implement semantics. Greg -- Greg Ward <[EMAIL PROTECTED]> http://www.gerg.ca/ I brought my BOWLING BALL -- and some DRUGS!! """ Provides odict, a subclass of the standard dict type that preserves insertion order. """ __revision__ = "$Id: odict.py,v 1.1 2004/03/09 02:32:08 gward Exp $" class odict(dict): """ An order-preserving mapping: lookup works like a dictionary, but keys(), values(), items(), etc. all return items in order of insertion. (Resetting a key preserves order; deleting a key and re-adding it moves it to the end.) Don't use this for enormous dictionaries, since it increases the memory use of the whole object, and the worst-case runtime for many common operations is O(N) rather than O(1). """ def __init__(self, arg=None): super(odict, self).__init__() self.order = [] if isinstance(arg, dict): self.update(arg) def copy(self): return self.__class__(self) def __delitem__(self, key): dict.__delitem__(self, key) self.order.remove(key) def __iter__(self): return iter(self.order) def __setitem__(self, key, value): add = key not in self dict.__setitem__(self, key, value) if add: self.order.append(key) def items(self): return [(key, self[key]) for key in self.order] def keys(self): return self.order def values(self): return [self[key] for key in self.order] def iteritems(self): for key in self.order: yield (key, self[key]) iterkeys = __iter__ def itervalues(self): for key in self.order: yield self[key] def clear(self): dict.clear(self) self.order.clear() def popitem(self): """ Remove and return the most-recently-added item of the mapping. """ key = self.order.pop() value = self[key] dict.__delitem__(self, key) return value def setdefault(self, key, value=None): if key in self: return self[key] else: self[key] = value return value def update(self, other): for key in other: self[key] = other[key] if __name__ == "__main__": m = odict() m['foo'] = 43 m['bar'] = 'barf' m[42] = None print m.items() ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
Steven Bethard wrote: Delaney, Timothy C (Timothy) <[EMAIL PROTECTED]> wrote: The perennial "how do I remove duplicates from a list" topic came up on c.l.py and in the discussion I mentioned the java 1.5 LinkedHashSet and LinkedHashMap. I'd thought about proposing these before, but couldn't think of where to put them. It was pointed out that the obvious place would be the collections module. For those who don't know, LinkedHashSet and LinkedHashMap are simply hashed sets and maps that iterate in the order that the keys were added to the set/map. I almost invariably use them for the above scenario - removing duplicates without changing order. Does anyone else think it would be worthwhile adding these to collections, or should I just make a cookbook entry? I guess I'm -0 on this. Though I was the one that suggested that collections is the right place to put them, I'm not really certain how much we gain by including them. I too would only ever use them for removing duplicates from a list. But if we're trying to provide a solution to this problem, I'd rather see an iterable-friendly one. See a previous thread on this issue[1] where I suggest something like: def filterdups(iterable): seen = set() for item in iterable: if item not in seen: seen.add(item) yield item Adding this to, say, itertools would cover all my use cases. And as long as you don't have too many duplicates, filterdups as above should keep memory consumption down better. I am -1 on adding LinkedHash*. While I can understand wanting to get rid of duplicates easily and wanting a good solutoin, Steven's snippet of code shows rolling your own solution is easy. Plus this can even be simplified down to a one-liner using itertools already:: itertools.ifilterfalse(lambda item, _set=set(): (item in _set) or (_set.add(item) and False), iterable) I don't think it is the prettiest solution, but it does show that coming up with one is not hard nor restricted to only a single solution that requires knowing some Python idiom (well, mine does for the default arg to the lambda, but Steven's doesn't). The last thing I want to happen is for Python's stdlib to grow every possible data structure out there like Java seems to have. I don't ever want to have to think about what *variant* of a data structure I should use, only what *type* of data structure (and no, I don't count collections.deque and Queue.Queue an overlap since the latter is meant more for thread communication, at least in my mind). -Brett ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents
Delaney, Timothy C (Timothy) <[EMAIL PROTECTED]> wrote: > The perennial "how do I remove duplicates from a list" topic came up on > c.l.py and in the discussion I mentioned the java 1.5 LinkedHashSet and > LinkedHashMap. I'd thought about proposing these before, but couldn't > think of where to put them. It was pointed out that the obvious place > would be the collections module. > > For those who don't know, LinkedHashSet and LinkedHashMap are simply > hashed sets and maps that iterate in the order that the keys were added > to the set/map. I almost invariably use them for the above scenario - > removing duplicates without changing order. > > Does anyone else think it would be worthwhile adding these to > collections, or should I just make a cookbook entry? I guess I'm -0 on this. Though I was the one that suggested that collections is the right place to put them, I'm not really certain how much we gain by including them. I too would only ever use them for removing duplicates from a list. But if we're trying to provide a solution to this problem, I'd rather see an iterable-friendly one. See a previous thread on this issue[1] where I suggest something like: def filterdups(iterable): seen = set() for item in iterable: if item not in seen: seen.add(item) yield item Adding this to, say, itertools would cover all my use cases. And as long as you don't have too many duplicates, filterdups as above should keep memory consumption down better. On the other hand, if someone could give me a few other reasonable use cases for LinkedHashSet and LinkedHashMap, I wouldn't object to their inclusion. Steve [1] http://mail.python.org/pipermail/python-list/2005-February/264179.html -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] LinkedHashSet/LinkedHashMap equivalents
The perennial "how do I remove duplicates from a list" topic came up on c.l.py and in the discussion I mentioned the java 1.5 LinkedHashSet and LinkedHashMap. I'd thought about proposing these before, but couldn't think of where to put them. It was pointed out that the obvious place would be the collections module. For those who don't know, LinkedHashSet and LinkedHashMap are simply hashed sets and maps that iterate in the order that the keys were added to the set/map. I almost invariably use them for the above scenario - removing duplicates without changing order. Does anyone else think it would be worthwhile adding these to collections, or should I just make a cookbook entry? Cheers. Tim Delaney ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com