Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-11 Thread Nick Coghlan
Anthony Baxter wrote:
On Thursday 10 March 2005 17:29, Raymond Hettinger wrote:
Or the implementation can have a switch to choose between keep-first
logic or replace logic.
The latter seems a bit odd to me.  The key position would be determined
by the first encountered while the value would be determined by the last
encountered.  Starting with [(10, v1), (20, v2), (10.0, v3)], the
ordered dictionary's items would look like [(10, v3), (20, v2)].

Or, alternately, we keep the last occurence, and move it to the new position.
There's a wide number of different use cases, each with a slightly different
final result, and for this reason alone I'm -0 on it for the library. 

Anthony
But surely such use cases could be more easily handled by subclassing from 
collections.OrderedDict and tweaking the semantics than by having to implement 
an ordered mapping from scratch.

Would the default semantics below really be that suprising?
"An ordered dictionary remembers the order in which keys are first seen and, 
when iterated over, returns entries based on that order. This applies to direct 
iteration, iteration over values and (key, value) pairs, to the list-producing 
methods (i.e. keys(), values() and items()) and to any other operations that 
involve implicit iteration (e.g. converting to a string representation). 
Overwriting an entry replaces its value, but does not affect its position in the 
key order. Removing an entry (using 'del') _does_ remove it from the key order. 
Accordingly, if the entry is later recreated, it will then occur last in the key 
order.

This behaviour is analagous to that of a list constructed using only 
list.append() to add items (indeed, the key order can be thought of as a list 
constructed in this fashion, with keys appended to the list when they are first 
encountered).

An ordered dictionary provides a sort() method. The sort operation is applied to 
the key ordering and affects future iteration over the dictionary. Again, this 
is analagous to sorting a list."

For instance, to convert a standard dictionary to an ordered dictionary using a 
specific key function:

  x = collections.OrderedDict(sorted(d.itervalues(), key=keyfunc))
Cheers,
Nick.
--
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
http://boredomandlaziness.skystorm.net
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-10 Thread Josiah Carlson

"Raymond Hettinger" <[EMAIL PROTECTED]> wrote:
> 
> [BJörn Lindqvist]
> > I would LOVE for **kwargs to be an ordered dict. It would allow me to
> > write code like this:
> > 
> > .class MyTuple:
> > .def __init__(self, **kwargs):
> > .self.__dict__ = ordereddict(kwargs)
> 
> This doesn't work.  The kwargs are already turned into a regular
> dictionary before ordereddict sees it.

From what I understand, he was saying that it would be nice if kwargs
were an ordered dict /instead of/ a standard dict.

Whether or not he realizes it will not happen due to the 2x memory
overhead, 2x speed hit, etc., every time kwargs are used, is another
matter.

Alternatively, BJorn could use a list of tuples and *args to preserve
order, but that is an off-list discussion for another day.

 - Josiah

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-10 Thread Raymond Hettinger
[BJörn Lindqvist]
> I would LOVE for **kwargs to be an ordered dict. It would allow me to
> write code like this:
> 
> .class MyTuple:
> .def __init__(self, **kwargs):
> .self.__dict__ = ordereddict(kwargs)

This doesn't work.  The kwargs are already turned into a regular
dictionary before ordereddict sees it.


Raymond Hettinger

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-10 Thread Anthony Baxter
On Thursday 10 March 2005 17:29, Raymond Hettinger wrote:
> Or the implementation can have a switch to choose between keep-first
> logic or replace logic.
>
> The latter seems a bit odd to me.  The key position would be determined
> by the first encountered while the value would be determined by the last
> encountered.  Starting with [(10, v1), (20, v2), (10.0, v3)], the
> ordered dictionary's items would look like [(10, v3), (20, v2)].

Or, alternately, we keep the last occurence, and move it to the new position.
There's a wide number of different use cases, each with a slightly different
final result, and for this reason alone I'm -0 on it for the library. 

Anthony

-- 
Anthony Baxter <[EMAIL PROTECTED]>
It's never too late to have a happy childhood.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-10 Thread BJörn Lindqvist
I would LOVE for **kwargs to be an ordered dict. It would allow me to
write code like this:

.class MyTuple:
.def __init__(self, **kwargs):
.self.__dict__ = ordereddict(kwargs)
.
.def __iter__(self):
.for k, v in self.__dict__.items():
.yield v
.
.t = MyTuple(r = 99, g = 12, b = 4)
.r, g, b = t
.print r, g, b

I know it goes beyond the original intention of the proposal, but I
figure I'd mention it anyway because the unorder of **kwargs has been
bugging me alot.

-- 
mvh Björn
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-10 Thread Barry Warsaw
On Thu, 2005-03-10 at 01:29, Raymond Hettinger wrote:

> Or the implementation can have a switch to choose between keep-first
> logic or replace logic.

This is what I meant by my previous follow up: while the concept of "an
ordered dictionary" is nice and seemingly generic enough, in practice I
suspect that exact semantics and other design factors will either tend
to push the stdlib implementation into ever more complexity, or won't
prevent people from continuing to roll their own because the stdlib
version "isn't quite right".

-Barry



signature.asc
Description: This is a digitally signed message part
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-10 Thread Barry Warsaw
On Wed, 2005-03-09 at 19:39, Tommy Burnette wrote:
> I'd say I'm +0.  fwiw- I've been using a locally-rolled OrderedDict
> implementation for the last 5-6 years in which insertion order is the
> only order respected.  I use it all over the place (in a code base of
> ~60k lines of python code).
> 
> so there's another use case for you.  bust as you say, easy to do
> yourself... 

I'm -0 on adding it to the stdlib, but mostly because I don't like the
name, and I suspect it's going to be one of those nuggets lurking in the
standard library that few people will use, tending either to overlook it
or just roll their own anyway because the standard one doesn't have
quite the right semantics.

FWIW, email.Message.Message /exposes/ an ordered dictionary-like
interface even though it's implemented as a simple list.  It was
considered at the time that the number of headers in an email message
wouldn't be so large that anything else would be worth the complexity. 
I think that still holds, for the original uses cases at least.

-Barry



signature.asc
Description: This is a digitally signed message part
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-10 Thread Nick Coghlan
Delaney, Timothy C (Timothy) wrote:
OTOH, "ordered set" and "ordered dict" implies different things to
different people - usually "sorted" rather than "the order things were
put in". Perhaps "temporally-ordered" ;)
OTGH*, I would expect an OrderedDict / OrderedSet to have 'add to the end' 
semantics, but also provide a 'sort()' method so that the ordering could be 
changed at a later date.

IOW, by default the ordering is temporal. Sorting the ordered dict/set changes 
the iteration order for the current contents. Further additions are still added 
in temporal order until such time as the dict/set is sorted again.

The parallels are to using list.append() to build a list, and list.sort() to 
order the current contents (in fact, a simplistic approach could use that exact 
technique to remember the order of keys, at the cost of doubling key storage 
requirements).

Cheers,
Nick.
*OTGH: On the gripping hand :)
--
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
http://boredomandlaziness.skystorm.net
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-10 Thread Michael Hudson
"Delaney, Timothy C (Timothy)" <[EMAIL PROTECTED]> writes:

> Set: Items are iterated over in the order that they are added. Adding an
> item that compares equal to one that is already in the set does not
> replace the item already in the set, and does not change the iteration
> order. Removing an item, then re-adding it moves the item to the end of
> the iteration order.

Well, this could be satisfied by an append_new operation on lists,
right (thinking of Common Lisps #'cl:pushnew)?  Complexity not that
great, of course, but I've written code like:

if a not in l:
l.append(a)

and not suffered that badly for it before now...

> Dict: Keys are iterated over in the order that they are added. Setting a
> value using a key that compares equal to one already in the dict
> replaces the value, but not the key, and does not change the iteration
> order. Removing a key (and value) then re-adding it moves the key to the
> end of the iteration order.

And these are what CL types call association lists, in effect.

Cheers,
mwh

-- 
  #ifndef P_tmpdir
  printf( "Go buy a better computer" );
  exit( ETHESKYISFALLINGANDIWANTMYMAMA );
 -- Dimitri Maziuk on writing secure code, asr
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Raymond Hettinger
> > If I read the proposal correctly, order would be determined by when
the
> > key was first encountered.  Presumably, that would mean that the
related
> > value would also be the first encountered (not overridden by
later-added
> > keys as dictated by your business requirements).
> 
> Hm  Guess this means we need a PEP!

Or the implementation can have a switch to choose between keep-first
logic or replace logic.

The latter seems a bit odd to me.  The key position would be determined
by the first encountered while the value would be determined by the last
encountered.  Starting with [(10, v1), (20, v2), (10.0, v3)], the
ordered dictionary's items would look like [(10, v3), (20, v2)].  


Raymond
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Steven Bethard
Delaney, Timothy C (Timothy) <[EMAIL PROTECTED]> wrote:
> Steven Bethard wrote:
> 
> > def filterdups(iterable):
> >  seen = set()
> >  for item in iterable:
> >  if item not in seen:
> >  seen.add(item)
> >  yield item
> >
> > Adding this to, say, itertools would cover all my use cases.  And as
> > long as you don't have too many duplicates, filterdups as above should
> > keep memory consumption down better.
> 
> Thinking about this further - memory usage would be almost identical. By
> the time you completed the iterable, you would have built up exactly the
> same set internally - although probably not as memory efficient since it
> would be being built piecemeal. OTOH, an ordered set has a bit of extra
> memory for maintaining the order, so it's going to be pretty close.

Definitely true that if you iterate through your entire iterable, it
doesn't gain you anything over an OrderedSet.  OTOH, if you only end
up looking at the first N elements of the iterable, then this would be
more efficient than putting the entire iterable into an OrderedSet and
taking the first N from there.  Of course you can put only the first
elements into the OrderedSet, but note that you can't just put in the
first N; you have to add elements of the iterable into the OrderedSet
until it has len(obj) == N.  Not that this should be more than a few
lines of code, but it's code that you wouldn't have to write with
fitlerdups.

That said, you're right of course that filterdups is certainly not a
big win over OrderedSet.  I was really just trying to point out that
if we were just trying to provide a solution to the
filtering-duplicates-while-maintaining-order problem that OrderedSet
wasn't the only path to that goal.  I think since then there have been
a number of other reasonable cases suggested for wanting an
OrderedSet, e.g.:

* A DB result set that is indexed by a key but also maintains row
order [Eli Stevens, Jeff Bauer]

* A config-file data structure that indexes by option names but
maintains the order of elements read from a config file [John
Williams]

* Drop-down field selectors indexed by both name and sequence position
[Jeff Bauer]

So I'm now probably +0.5 on adding an OrderedSet to collections.  Not
that my opinion is particularly influential, of course. ;-)

Steve
-- 
You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Delaney, Timothy C (Timothy)
Steven Bethard wrote:

> def filterdups(iterable):
>  seen = set()
>  for item in iterable:
>  if item not in seen:
>  seen.add(item)
>  yield item
> 
> Adding this to, say, itertools would cover all my use cases.  And as
> long as you don't have too many duplicates, filterdups as above should
> keep memory consumption down better.

Thinking about this further - memory usage would be almost identical. By
the time you completed the iterable, you would have built up exactly the
same set internally - although probably not as memory efficient since it
would be being built piecemeal. OTOH, an ordered set has a bit of extra
memory for maintaining the order, so it's going to be pretty close.

The only thing this gains you (and it's significant) is the ability to
work on any iterable lazily.

Tim Delaney
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Delaney, Timothy C (Timothy)
Jeff Bauer wrote:

> I'm not specifically lobbying for its inclusion in
> stdlib, but I often find an ordered dict useful when I
> want both ordered and random access, e.g. situations:
> 
>- database table fields/attributes
>- drop down field selectors

Yep - these are also cases that are familiar to me - it's the type of
thing you don't think about until you actually need it.

> In both cases, I could get by with other techniques, but I
> would use stdlib ordered dicts if they were available.
> I also prefer the term "ordered dict" to LinkedHashXXX.

You may notice the subject is LinkedHashXXX *equivalents* ;) There is no
way I would ever propose adding them with those names.

OTOH, "ordered set" and "ordered dict" implies different things to
different people - usually "sorted" rather than "the order things were
put in". Perhaps "temporally-ordered" ;)

BTW, just to clarify the semantics:

Set: Items are iterated over in the order that they are added. Adding an
item that compares equal to one that is already in the set does not
replace the item already in the set, and does not change the iteration
order. Removing an item, then re-adding it moves the item to the end of
the iteration order.

Dict: Keys are iterated over in the order that they are added. Setting a
value using a key that compares equal to one already in the dict
replaces the value, but not the key, and does not change the iteration
order. Removing a key (and value) then re-adding it moves the key to the
end of the iteration order.

Tim Delaney
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Jeff Bauer
Thomas Heller <[EMAIL PROTECTED]> wrote:
> Steven Bethard <[EMAIL PROTECTED]> writes:
>
>>  What use do you have for it other than filtering
>>  duplicates from a list while retaining order?
>
> If this were the only use case, you are right.  I cannot
> remember the use case I once had right now, and probably
> the code has been rewritten anyway.  But I assume use
> cases exist, otherwise there weren't so many recipes
> for the ordered dictionary.
I'm not specifically lobbying for its inclusion in
stdlib, but I often find an ordered dict useful when I
want both ordered and random access, e.g. situations:
  - database table fields/attributes
  - drop down field selectors
In both cases, I could get by with other techniques, but I
would use stdlib ordered dicts if they were available.
I also prefer the term "ordered dict" to LinkedHashXXX.
Jeff Bauer
Rubicon, Inc.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Aahz
On Wed, Mar 09, 2005, Raymond Hettinger wrote:
> [Aahz]
>>
>> Gee, I just found out I could have used an OrderedDict today.  (We're
>> using a dict that we're now having to add an auxilliary list to to track
>> when keys are added.)  (This isn't particularly an argument in favor of
>> adding OrderedDict to stdlib, but it's another use case.  Each dict key
>> contains a dict value; the subkeys from later-added keys are supposed to
>> override earlier subkeys.  The original implementation relied on subkeys
>> being unique, but that doesn't work for our new business requirements.)
> 
> If I read the proposal correctly, order would be determined by when the
> key was first encountered.  Presumably, that would mean that the related
> value would also be the first encountered (not overridden by later-added
> keys as dictated by your business requirements).  

Hm  Guess this means we need a PEP!
-- 
Aahz ([EMAIL PROTECTED])   <*> http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code -- 
not in reams of trivial code that bores the reader to death."  --GvR
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Raymond Hettinger
[Aahz]
> Gee, I just found out I could have used an OrderedDict today.  (We're
> using a dict that we're now having to add an auxilliary list to to
track
> when keys are added.)  (This isn't particularly an argument in favor
of
> adding OrderedDict to stdlib, but it's another use case.  Each dict
key
> contains a dict value; the subkeys from later-added keys are supposed
to
> override earlier subkeys.  The original implementation relied on
subkeys
> being unique, but that doesn't work for our new business
requirements.)

If I read the proposal correctly, order would be determined by when the
key was first encountered.  Presumably, that would mean that the related
value would also be the first encountered (not overridden by later-added
keys as dictated by your business requirements).  


Raymond
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Aahz
On Wed, Mar 09, 2005, Tommy Burnette wrote:
>
> I'd say I'm +0.  fwiw- I've been using a locally-rolled OrderedDict
> implementation for the last 5-6 years in which insertion order is the
> only order respected.  I use it all over the place (in a code base of
> ~60k lines of python code).
> 
> so there's another use case for you.  bust as you say, easy to do
> yourself... 

Gee, I just found out I could have used an OrderedDict today.  (We're
using a dict that we're now having to add an auxilliary list to to track
when keys are added.)  (This isn't particularly an argument in favor of
adding OrderedDict to stdlib, but it's another use case.  Each dict key
contains a dict value; the subkeys from later-added keys are supposed to
override earlier subkeys.  The original implementation relied on subkeys
being unique, but that doesn't work for our new business requirements.)
-- 
Aahz ([EMAIL PROTECTED])   <*> http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code -- 
not in reams of trivial code that bores the reader to death."  --GvR
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Tommy Burnette

I'd say I'm +0.  fwiw- I've been using a locally-rolled OrderedDict
implementation for the last 5-6 years in which insertion order is the
only order respected.  I use it all over the place (in a code base of
~60k lines of python code).

so there's another use case for you.  bust as you say, easy to do
yourself... 

=?ISO-8859-1?Q?"Martin_v._L=F6wis"?= writes:
| Thomas Heller wrote:
| > I cannot understand why people are against adding it to stdlib (after
| > the name, the implementation, and the exact place have been decided).
| > It's certainly a useful data type, isn't it?
| 
| It depends on the precise semantics. You often want a dictionary where
| the keys come out in some order - but that is rarely the order in which
| they were added to the dictionary. Most frequently, you want the keys
| sorted, according to some criteria. If not that, I would assume that you
| typically have the order of keys determined before even filling the
| dictionary, in which case you can do
| 
| for k in keys_in_preferred_order:
|  v = hashtable[k]
|  process(k,v)
| 
| I remember having needed that once in the past 15 years (in Smalltalk
| at the time), so I wrote an OrderedDictionary for Smalltalk/V (which
| didn't have it). It took me an hour or so.
| 
| I don't recall what precisely it was that I needed it for, and I cannot
| think of any use case for the data type right now.
| 
| So I'm -0 on adding the data type: I have a vague feeling it is needed,
| but rarely, and I don't know precisely what for.
| 
| Regards,
| Martin
| ___
| Python-Dev mailing list
| Python-Dev@python.org
| http://mail.python.org/mailman/listinfo/python-dev
| Unsubscribe: http://mail.python.org/mailman/options/python-dev/tommy%40ilm.com

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Eli Stevens (WG.c)
Steven Bethard wrote:
Thomas Heller <[EMAIL PROTECTED]> wrote:
[About an ordered dictionary]
Well, that was basically the question I posed.  So far I've seen only
one use for it, and that one is better served by adding a function to
itertools.  What use do you have for it other than filtering
duplicates from a list while retaining order?
The primary use case I have deals with DB result sets*.  The ordering of 
the rows returned from a query is important, so keeping the iteration 
order is nice.  Most of the tables I deal with have keys of some kind, 
and being able to pull out a result row by key is also nice.  Granted, I 
rarely use /both/ at the same time, but it is nice to not have to 
specify how the result set will be used when I retrieve it.

To me, this seems to be the same concept as the config file parsing 
previously mentioned.

I don't feel qualified to have an opinion** about inclusion in the 
stdlib, much less vote.

relurkin'ly yrs,
Eli
[*] - In my case, it's actually coded in Java, for work.  There might be 
a reason that this problem isn't language-generic, but the 1.5 minutes I 
spent thinking about it were not illuminating.

[**] - Yet I have one anyway.  This kind of datatype seems one of those 
easy-to-get-half-right things that could benefit from a solid 
implementation.  It also doesn't strike me as controversial in terms of 
API or internal structure.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Martin v. Löwis
John Williams wrote:
The other use case I have is for dealing with data where the iteration 
order doesn't matter to the program but it does matter to users.  For 
instance, consider the ConfigParser's write method.  Any ordering of 
values in the output is functionally equivalent, but the original data 
is likely to have come from a file that was arranged in some meaningful 
order, and it would be nice to preserve that order, especially if it can 
be done with no extra effort.
That is a good example, IMO. But then, in the specific case, the
dictionary for each section is created deeply inside ConfigParser,
with the lines
cursect = {'__name__': sectname}
self._sections[sectname] = cursect
So this uses a builtin dict, and there is no easy way to override it
for the application.
Of course, given your reasoning, ConfigParser *should* use an
OrderedDictionary (probably both for cursect and for self._sections),
in which case this would all be transparent to the application.
Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread John Williams
Steven Bethard wrote:
> Thomas Heller <[EMAIL PROTECTED]> wrote:
>
>> [About an ordered dictionary]
>
>
> Well, that was basically the question I posed.  So far I've seen only
> one use for it, and that one is better served by adding a function to
> itertools.  What use do you have for it other than filtering
> duplicates from a list while retaining order?
>
> Steve
Using a LinkedHashMap generally cuts down in the amount of apparent 
randomness in a program.  This is especially helpful when it comes time 
to debug a really complicated program by diffing log files, since it 
prevents slightly different maps from having wildly different iteration 
orders.  Often using a plain HashMap can introduce enough randomness to 
make two otherwise similar log files nearly impossible to compare.

The other use case I have is for dealing with data where the iteration 
order doesn't matter to the program but it does matter to users.  For 
instance, consider the ConfigParser's write method.  Any ordering of 
values in the output is functionally equivalent, but the original data 
is likely to have come from a file that was arranged in some meaningful 
order, and it would be nice to preserve that order, especially if it can 
be done with no extra effort.

--jw
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Martin v. Löwis
James Y Knight wrote:
I use ordered dictionaries for testing. With an ordered dict I can 
string compare the output of my program to what is expected. Without an 
ordered dict, I'd have to re-parse the output and order it, which would 
require some complicated code that's just as likely to be wrong as the 
code I'm trying to test.
I see. I would argue that you were better off if the test cases were 
sorted (according to some total, stable-across-releases, order), rather
than being ordered.

Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Martin v. Löwis
Thomas Heller wrote:
I cannot understand why people are against adding it to stdlib (after
the name, the implementation, and the exact place have been decided).
It's certainly a useful data type, isn't it?
It depends on the precise semantics. You often want a dictionary where
the keys come out in some order - but that is rarely the order in which
they were added to the dictionary. Most frequently, you want the keys
sorted, according to some criteria. If not that, I would assume that you
typically have the order of keys determined before even filling the
dictionary, in which case you can do
for k in keys_in_preferred_order:
v = hashtable[k]
process(k,v)
I remember having needed that once in the past 15 years (in Smalltalk
at the time), so I wrote an OrderedDictionary for Smalltalk/V (which
didn't have it). It took me an hour or so.
I don't recall what precisely it was that I needed it for, and I cannot
think of any use case for the data type right now.
So I'm -0 on adding the data type: I have a vague feeling it is needed,
but rarely, and I don't know precisely what for.
Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread James Y Knight
On Mar 9, 2005, at 4:19 PM, Thomas Heller wrote:
If this were the only use case, you are right.  I cannot remember the
use case I once had right now, and probably the code has been rewritten
anyway.  But I assume use cases exist, otherwise there weren't so many
recipes for the ordered dictionary.
I use ordered dictionaries for testing. With an ordered dict I can 
string compare the output of my program to what is expected. Without an 
ordered dict, I'd have to re-parse the output and order it, which would 
require some complicated code that's just as likely to be wrong as the 
code I'm trying to test.

James
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Steven Bethard
Thomas Heller <[EMAIL PROTECTED]> wrote:
> Steven Bethard <[EMAIL PROTECTED]> writes:
> 
> >  What use do you have for it other than filtering
> > duplicates from a list while retaining order?
> 
> If this were the only use case, you are right.  I cannot remember the
> use case I once had right now, and probably the code has been rewritten
> anyway.  But I assume use cases exist, otherwise there weren't so many
> recipes for the ordered dictionary.

Sorry, I didn't mean to come across as suggesting that there weren't
other use cases for it.  I'm only asking for people who know these
other use cases to present them here.

At the moment, we have only one use case, and for that use case,
OrderedDict/OrderedSet isn't really the best solution.  (That's all my
code was intended to point out -- I don't really care if it makes it
into itertools or not.)  If people could present a few reasonable
problems where OrderedDict/OrderedSet really do provide the best
solutions, and then make the case that these use cases are reasonably
frequent, I think there would be much better support for adding such
types to the standard library.

Steve
-- 
You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Thomas Heller
Steven Bethard <[EMAIL PROTECTED]> writes:

> Thomas Heller <[EMAIL PROTECTED]> wrote:
>> [About an ordered dictionary]
> [snip] 
>> I cannot understand why people are against adding it to stdlib (after
>> the name, the implementation, and the exact place have been decided).
>> It's certainly a useful data type, isn't it?
>
> Well, that was basically the question I posed.  So far I've seen only
> one use for it, and that one is better served by adding a function to
> itertools.

Hm, removing duplicates from a list is an algorithm, not a data
structure.  And the code you posted (no offense intended) is, also imo,
faster written by an experienced programmer than located in some module.
OTOH, I see no problem adding it to itertools.

>  What use do you have for it other than filtering
> duplicates from a list while retaining order?

If this were the only use case, you are right.  I cannot remember the
use case I once had right now, and probably the code has been rewritten
anyway.  But I assume use cases exist, otherwise there weren't so many
recipes for the ordered dictionary.

Thomas

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Steven Bethard
Thomas Heller <[EMAIL PROTECTED]> wrote:
> [About an ordered dictionary]
[snip] 
> I cannot understand why people are against adding it to stdlib (after
> the name, the implementation, and the exact place have been decided).
> It's certainly a useful data type, isn't it?

Well, that was basically the question I posed.  So far I've seen only
one use for it, and that one is better served by adding a function to
itertools.  What use do you have for it other than filtering
duplicates from a list while retaining order?

Steve
-- 
You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Thomas Heller
[About an ordered dictionary]

> Delaney, Timothy C (Timothy) wrote:

>> Does anyone else think it would be worthwhile adding these to
>> collections, or should I just make a cookbook entry?

Greg Ward <[EMAIL PROTECTED]> writes:
> +1 on a cookbook entry.  +0 on adding to stdlib.

"Martin v. Löwis" <[EMAIL PROTECTED]> writes:

> -0 for the addition to the collections module, -1 on the specific
> name.

I cannot understand why people are against adding it to stdlib (after
the name, the implementation, and the exact place have been decided).
It's certainly a useful data type, isn't it?

Thomas

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Martin v. Löwis
Delaney, Timothy C (Timothy) wrote:
Does anyone else think it would be worthwhile adding these to
collections, or should I just make a cookbook entry?
-0 for the addition to the collections module, -1 on the specific
name.
Java made a *big* mistake by putting "Hash" into the names of these
things. From the outside, it is primarily a Dictionary; only when
you look closer you see that this specific dictionary requires
hashable keys (as opposed to, say, the C++ std::map, which requires
sortable keys).
So consequently, the data type should be called OrderedDictionary,
and its cookbook recipe is
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/107747
Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-08 Thread Delaney, Timothy C (Timothy)
Greg Ward wrote:

> I'll attach another approach to the same problem, an ordered
> dictionary object.  I believe the semantics of
> adding/readding/deleting keys is the same as java.util.LinkedHashMap
> -- certainly it seems the most sensible and easy-to-implement
> semantics. 

That's essentially the same approach I used - I didn't get around to
properly implementing a doubly-linked list between the keys - I just
maintained a list with the correct order. If I'm going to write a
Cookbook recipe though I should probably do it properly ...

Personally, I think it would be quite nice if all python dictionaries
had these semantics - it would be nice to be able to say that items will
be returned in the order they're inserted - but I wouldn't want to incur
the additional cost in such a fundamental data structure.

One thing I did notice - dict.__repr__ and dict.__str__ don't produce
strings in the iteration order of subclasses - they always use the
arbitrary dict iteration order. Is this intentional?

Tim Delaney
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-08 Thread Greg Ward
On 09 March 2005, Delaney, Timothy C (Timothy) said:
> For those who don't know, LinkedHashSet and LinkedHashMap are simply
> hashed sets and maps that iterate in the order that the keys were added
> to the set/map. I almost invariably use them for the above scenario -
> removing duplicates without changing order.
> 
> Does anyone else think it would be worthwhile adding these to
> collections, or should I just make a cookbook entry?

+1 on a cookbook entry.  +0 on adding to stdlib.

I'll attach another approach to the same problem, an ordered dictionary
object.  I believe the semantics of adding/readding/deleting keys is the
same as java.util.LinkedHashMap -- certainly it seems the most sensible
and easy-to-implement semantics.

Greg
-- 
Greg Ward <[EMAIL PROTECTED]> http://www.gerg.ca/
I brought my BOWLING BALL -- and some DRUGS!!
"""
Provides odict, a subclass of the standard dict type that preserves
insertion order.
"""

__revision__ = "$Id: odict.py,v 1.1 2004/03/09 02:32:08 gward Exp $"

class odict(dict):
"""
An order-preserving mapping: lookup works like a dictionary, but
keys(), values(), items(), etc. all return items in order of
insertion.  (Resetting a key preserves order; deleting a key and
re-adding it moves it to the end.)

Don't use this for enormous dictionaries, since it increases the
memory use of the whole object, and the worst-case runtime for many
common operations is O(N) rather than O(1).
"""

def __init__(self, arg=None):
super(odict, self).__init__()
self.order = []
if isinstance(arg, dict):
self.update(arg)

def copy(self):
return self.__class__(self)

def __delitem__(self, key):
dict.__delitem__(self, key)
self.order.remove(key)

def __iter__(self):
return iter(self.order)

def __setitem__(self, key, value):
add = key not in self
dict.__setitem__(self, key, value)
if add:
self.order.append(key)

def items(self):
return [(key, self[key]) for key in self.order]

def keys(self):
return self.order

def values(self):
return [self[key] for key in self.order]

def iteritems(self):
for key in self.order:
yield (key, self[key])

iterkeys = __iter__

def itervalues(self):
for key in self.order:
yield self[key]

def clear(self):
dict.clear(self)
self.order.clear()

def popitem(self):
"""
Remove and return the most-recently-added item of the mapping.
"""
key = self.order.pop()
value = self[key]
dict.__delitem__(self, key)
return value

def setdefault(self, key, value=None):
if key in self:
return self[key]
else:
self[key] = value
return value

def update(self, other):
for key in other:
self[key] = other[key]

if __name__ == "__main__":
m = odict()
m['foo'] = 43
m['bar'] = 'barf'
m[42] = None
print m.items()
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-08 Thread Brett C.
Steven Bethard wrote:
Delaney, Timothy C (Timothy) <[EMAIL PROTECTED]> wrote:
The perennial "how do I remove duplicates from a list" topic came up on
c.l.py and in the discussion I mentioned the java 1.5 LinkedHashSet and
LinkedHashMap. I'd thought about proposing these before, but couldn't
think of where to put them. It was pointed out that the obvious place
would be the collections module.
For those who don't know, LinkedHashSet and LinkedHashMap are simply
hashed sets and maps that iterate in the order that the keys were added
to the set/map. I almost invariably use them for the above scenario -
removing duplicates without changing order.
Does anyone else think it would be worthwhile adding these to
collections, or should I just make a cookbook entry?

I guess I'm -0 on this.
Though I was the one that suggested that collections is the right
place to put them, I'm not really certain how much we gain by
including them.  I too would only ever use them for removing
duplicates from a list.  But if we're trying to provide a solution to
this problem, I'd rather see an iterable-friendly one.  See a previous
thread on this issue[1] where I suggest something like:
def filterdups(iterable):
 seen = set()
 for item in iterable:
 if item not in seen:
 seen.add(item)
 yield item
Adding this to, say, itertools would cover all my use cases.  And as
long as you don't have too many duplicates, filterdups as above should
keep memory consumption down better.
I am -1 on adding LinkedHash*.  While I can understand wanting to get rid of 
duplicates easily and wanting a good solutoin, Steven's snippet of code shows 
rolling your own solution is easy.

Plus this can even be simplified down to a one-liner using itertools already::
  itertools.ifilterfalse(lambda item, _set=set():
   (item in _set) or (_set.add(item) and False),
 iterable)
I don't think it is the prettiest solution, but it does show that coming up 
with one is not hard nor restricted to only a single solution that requires 
knowing some Python idiom (well, mine does for the default arg to the lambda, 
but Steven's doesn't).

The last thing I want to happen is for Python's stdlib to grow every possible 
data structure out there like Java seems to have.  I don't ever want to have to 
think about what *variant* of a data structure I should use, only what *type* 
of data structure (and no, I don't count collections.deque and Queue.Queue an 
overlap since the latter is meant more for thread communication, at least in my 
mind).

-Brett
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-08 Thread Steven Bethard
Delaney, Timothy C (Timothy) <[EMAIL PROTECTED]> wrote:
> The perennial "how do I remove duplicates from a list" topic came up on
> c.l.py and in the discussion I mentioned the java 1.5 LinkedHashSet and
> LinkedHashMap. I'd thought about proposing these before, but couldn't
> think of where to put them. It was pointed out that the obvious place
> would be the collections module.
> 
> For those who don't know, LinkedHashSet and LinkedHashMap are simply
> hashed sets and maps that iterate in the order that the keys were added
> to the set/map. I almost invariably use them for the above scenario -
> removing duplicates without changing order.
> 
> Does anyone else think it would be worthwhile adding these to
> collections, or should I just make a cookbook entry?

I guess I'm -0 on this.

Though I was the one that suggested that collections is the right
place to put them, I'm not really certain how much we gain by
including them.  I too would only ever use them for removing
duplicates from a list.  But if we're trying to provide a solution to
this problem, I'd rather see an iterable-friendly one.  See a previous
thread on this issue[1] where I suggest something like:

def filterdups(iterable):
 seen = set()
 for item in iterable:
 if item not in seen:
 seen.add(item)
 yield item

Adding this to, say, itertools would cover all my use cases.  And as
long as you don't have too many duplicates, filterdups as above should
keep memory consumption down better.

On the other hand, if someone could give me a few other reasonable use
cases for LinkedHashSet and LinkedHashMap, I wouldn't object to their
inclusion.

Steve

[1] http://mail.python.org/pipermail/python-list/2005-February/264179.html

-- 
You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-08 Thread Delaney, Timothy C (Timothy)
The perennial "how do I remove duplicates from a list" topic came up on
c.l.py and in the discussion I mentioned the java 1.5 LinkedHashSet and
LinkedHashMap. I'd thought about proposing these before, but couldn't
think of where to put them. It was pointed out that the obvious place
would be the collections module.

For those who don't know, LinkedHashSet and LinkedHashMap are simply
hashed sets and maps that iterate in the order that the keys were added
to the set/map. I almost invariably use them for the above scenario -
removing duplicates without changing order.

Does anyone else think it would be worthwhile adding these to
collections, or should I just make a cookbook entry?

Cheers.

Tim Delaney
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com