from:"Christoph Zwerschke"

This is probably a FAQ, but I dare to ask it nevertheless since I 
haven't found a satisfying answer yet: Why isn't there an "ordered 
dictionary" class at least in the standard list? Time and again I am 
missing that feature. Maybe there is something wrong with my programming 
style, but I rather think it is generally useful.

I fully agree with the following posting where somebody complains why so 
very basic and useful things are not part of the standard library:
http://groups.google.com/group/comp.lang.python/msg/e652d2f771a49857

Are there plans to get it into the standard lib sometime?

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

> What answers have you received that have not been satisfactory?

I googled a little bit and haven't found many answers at all. Some were 
in the posting I mentioned: Stuff should go into the standard lib only 
when it is mature and "right" enough. However, we are already at Python 
2.4 and there is still no ordered dictionary, though there is a lot of 
other specialized stuff.

> Some possible answers: The native dict is very fast, partly because
> the implementation doesn't need to ensure any particular ordering.

Ok, but that's not an argument against providing ordered dictionaries as 
well.

> Ordering the keys isn't the normal case, and can be done easily when
> needed.

That depends. Maybe I do not want the keys to be sorted alphabetically, 
but according to some criteria which cannot be derived from the keys 
themselves.

> You asked "why not" rather than "has anyone done this anyway"; if
> you asked the latter of the Python Cookbook, you might find something
> like this.

Yes, I also found that others have done it more than once, and I know 
that it's not so difficult to do. There are at least two recipes in the 
mentioned cookbook and there is odict in pythonutils. The question was 
why is it not available in the *standard* lib.

> In what cases do you find yourself needing a dict that preserves its
> key order? Can you present a use case that would be improved by an
> ordered dict?

There are too many different situations and it would be too much to 
explain them here, usually in the case mentioned above where the keys 
are not sorted alphabetically. I often solve them by using two data 
structures, a list or tuple, plus a dictionary. For instance, the list 
could contain the names of database fields which shall be output in a 
certain order, and the dictionary values could contain human readable 
description of the database fields for headers or something. Instead of 
maintaining both data structures I feel it would be more natural to use 
only one ordered dictionary.

> For my part, I consider it a virtue of Python that the standard
> library doesn't change rapidly. It allows many competing
> implementations to be shaken out before everyone starts depending on
> any one of them.

Ok, but this can be used as an argument to not add anything to the 
standard lib any more. There are already enough competing 
implementations. Also, the implementation details are not so important, 
there must be only agreement on the interface and behavior which should 
not be so difficult in this case.

I simply wanted to ask why it is not available in the standard lib, 
since I simply don't know

- has it not been demanded loud enough?
- is it really not needed (if you need it it shows you are doing 
something wrong)?
- because nobody presented a satisfying implementation yet?
- are there hidden difficulties or controversial issues?

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

[EMAIL PROTECTED] schrieb:
> By ordered dict, one usually wants order that is arbitary which cannot
> be derived from the content, say insertion order(most ordered dict I
> saw use this order).
> I am writing a web applications(simple forms) which has a number of
> fields. Each field naturally has a name and a number of
> attributes(formatting etc.), like this :
> d = {'a':{...}, 'b':{}}

Things like that are also my typical use case. The keys usually contain 
things like database fields or html form fields, the values contain the 
corresponding description, formatting, data type or data itself etc.

The example above is a bit misleading, because using 'a', 'b' as keys 
can give the impression that you just have to sort() the keys to have 
what you want. So let's make it more realistic:

d = { 'pid': ('Employee ID', 'int'),
   'name': ('Employee name', 'varchar'),
   'sal': ('Salary', 'float') }

Now if I want these things to be presented in this order, I need to run 
through a separate list ('pid', 'name', 'sal') that holds the order.

Ok, you could simply use a list instead of a dictionary:

d = ( ('pid', 'Employee ID', 'int'),
   ('name', 'Employee name', 'varchar'),
   ('sal', 'Salary', 'float') )

This works as long as you *only* have to go through the list 
sequentially. But maybe you want to print the name with its description 
at some other place as well. Now how do you access its description 
'Employee name' so easily?

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Fredrik Lundh wrote:
> if you restructure the list somewhat
> d = (
> ('pid', ('Employee ID', 'int')),
> ('name', ('Employee name', 'varchar')),
> ('sal', ('Salary', 'float'))
> )
> you can still loop over the list
> ...
> but you can easily generate an index when you need it:
> index = dict(d)

That's exactly the kind of things I find myself doing too often and what 
I was talking about: You are using *two* pretty redundant data 
structures, a dictionary and a list/tuple to describe the same thing. 
Ok, you can use a trick to automatically create the dictionary from the 
tuple, but still it feels somewhat "unnatural" for me. A "ordered 
dictionary" would be the more "natural" data structure here.

I also wanted to mention the uglyness in the definition (nested tuples), 
but then I understood that even an ordered dictionary would not 
eliminate that uglyness, since the curly braces are part of the Python 
syntax and cannot be used for creating ordered dictionaries anyway. I 
would have to define the ordered dictionary in the very same ugly way:

d = odict(('pid', ('Employee ID', 'int')),
('name', ('Employee name', 'varchar')),
('sal', ('Salary', 'float')))

(Unless the Python syntax would be extend to use double curly braces or 
something for ordered dictionaries - but I understand that this is not 
an option.)

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Ben Finney wrote:

> Without an example, it's hard to know what you want to do and whether
> an ordered dictionary is the best way to do it.

I have indicated an example, discussed in more detail in another subthread.

>> There are already enough competing implementations.
> Have they been sufficiently shaken out to show a clearly superior
> version? Is any version sufficiently beneficial to write a PEP for its
> inclusion in the standard library?

At least it shows I'm not the only one who thinks ordered dictionaries 
may be sometimes nice to have.

 >> I simply wanted to ask why it is not available in the standard lib, 
 >> since I simply don't know
 >> - has it not been demanded loud enough?
> Loud demands don't count for much. PEPs with popular working
> implementations do.

Sorry, I did not mean "loud enough" but "often enough". The same what 
you are calling "popular."

>> - because nobody presented a satisfying implementation yet?
> I'm not sure what you mean by "satisfying".

You can take your own definition: "sufficiently shaken out", "working", 
"popular", and "succifiently beneficial" and "proven (to the BDFL's 
criteria)".

> Another possibility: ordered dictionaries are not needed when Python
> 2.4 has the 'sorted' builtin.

The 'sorted' function does not help in the case I have indicated, where 
"I do not want the keys to be sorted alphabetically, but according to 
some criteria which cannot be derived from the keys themselves."

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

[EMAIL PROTECTED] wrote:
> Personally, I have needs for ordered dict but I don't think it should
> be in standard library though, as different situation called for
> different behaviour for "ordered" and skewing my code to a standard lib
> way is no good.

I have started the thread in the first place because I believed it is 
pretty unabmiguous what an "ordered dictionary" is and how it should 
behave. That's why I asked myself why something that straigthforward has 
not been added to the standard lib yet. Maybe I'm wrong; I must admit 
that I haven't meditated about it very much.

Do you have an example for different options of behavior?

-- Christoph

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Ben Finney wrote:

 >>> Another possibility: ordered dictionaries are not needed when Python
 >>> 2.4 has the 'sorted' builtin.

Christoph Zwerschke wrote:

 >> The 'sorted' function does not help in the case I have indicated,
 >> where "I do not want the keys to be sorted alphabetically, but
 >> according to some criteria which cannot be derived from the keys
 >> themselves."

Mike Meyer wrote:

 > And how would an ordered dictionary help in this case?

Maybe there is some confusion between an "orderable" and an "ordered" 
dictionary. When I talk about "ordered dictionary", then in the simplest 
case I just set up my ordered dictionary with my preferred key order and 
it stays like that. This allows me to later iterate through the 
dictionary in this preferred order, while being still able to randomly 
access data from the dictionary at other places.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Fredrik Lundh wrote:
> (as an example, on my machine, using Foord's OrderedDict class
> on Zwerschke's example, creating the dictionary in the first place
> takes 5 times longer than the index approach, and accessing an
> item takes 3 times longer.  you can in fact recreate the index 6
> times before OrderedDict is faster; if you keep the index around,
> the OrderedDict approach never wins...)

You're right; I found creating a Larosa/Foord OrderedDict in this 
example to be even 8 times slower than an ordinary dict. However, two 
things need to be said here: 1) The dictionary in my exmaple was pretty 
small (only 3 items), so you are not really measuring the performance of 
the ordered dict, but mainly the overhead of using a user derived class 
in comparison with the built-in dict type. And 2) the implementation by 
Larosa/Foord is very slow and can be easily improved, for instance like 
that:

def __init__(self, init_val = ()):
 dict.__init__(self, init_val)
 self.sequence = [x[0] for x in init_val]

With this change, creating the ordered dictionary is considerably faster 
and for an average size dictionary, the factor of 8 pretty quickly 
converges against 1.

But of course, it will always be slower since it is constructed on top 
of the built-in dict. In end effect, you always have to maintain a 
sequence *plus* a dictionary, which will be always slower than a sheer 
dictionary. The ordered dictionary class just hides this uglyness of 
having to maintain a dictionary plus a sequence, so it's rather an issue 
of convenience in writing and reading programs than a performance issue.

It may be different if the ordered dict would be implemented directly as 
an ordered hash table in C.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Alex Martelli wrote:

> Note the plural in 'insertion orderS': some people care about the FIRST
> time a key was added to a dict, some about the LAST time it was added,
> some about the latest time it was 'first inserted' (added and wasn't
> already there) as long as it's never been deleted since that occasion --
> and these are just a few of the multifarious orders based on the time of
> insertions and deletions of keys.

Ok, I start to understand that ambiguity emerges when you delete and 
insert items. I didn't think much about this problem because  my use 
cases usually do not involve inserttion or deletion after the ordered 
dictionary has been created.

But I think the following rule is "natural" enough to consider it as THE 
standard behavior of ordered dictionaries:

"Insertion: If the key exists: Don't change the order. If it does not 
exist: Append it to the sequence of keys. Deletion: Remove from the 
sequence of keys."

I think this is also the behavior of associative arrays in PHP or Perl 
and could be considered as the "ONE unambiguous definition".

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Fredrik Lundh wrote:
> I'll repeat this one last time: for the use cases presented by Zwerschke
> and "bonono", using a list as the master data structure, and creating the
> dictionary on demand, is a lot faster than using a ready-made ordered
> dict implementation.  if you will access things via the dictionary a lot,
> you can cache the dictionary somewhere.  if not, you can recreate it
> several times and still get a net win.

You're right in pointing out that the advantage of ordered dictionaries 
(unless you use an omptimized C implementation) is not a performance gain.

But please see my other reply: If the dictionary has more than 3 items 
(say 10 or 20), and an effective ordered dict is used, it's not really 
"a lot" slower. At least if we are talking about a situation were "on 
demand" is "always". So, on the other side there isn't such a big 
performance loss when using ordered dictionaries as well.

The advantage of using an ordered dictionary is that you can set up your 
ordered dictionary (say, describing your database columns) once, and 
then can access it in any way you like in the following: Iterate over it 
in a guaranteed order or access item, always refering to the same 
object, without needing to care about building and caching auxiliary 
objects with different names depending on what you are doing.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Alex Martelli schrieb:
> Perl hashes now keep track of 'order of keys'?  That's new to me, they
> sure didn't back when I used Perl!

Maybe I shouldn't have talked about Perl when I'm an ignoramus about 
that language... You're right, Perl has unordered arrays. That was new 
to me since I associate the term "array" always with "ordered" and I 
just remembered that PHP's assoc arrays are similar to Perl's but in 
fact, and the examples I have read did not mention about that problem.

> What about PHP?

You can conclude that PHP's assoc arrays are ordered from the fact that 
the language provides a ksort() function to order the keys. And I think 
PHP's insertion order is the one I mentioned in my last post.

Anyway, it would be interesting to examine this in detail and how this 
is implemented in other languages.

> "first insertion (since the last deletion if any)" is ONE unambiguous
> definition, but surely not "_the_ ONE" with emphasis on ``the''.
> I see nothing _ambiguous_ (nor _unnatural_) in being interested in the
 > *last* insertion, for example; indeed if phrased as "upon insertion, put
 > the key at the end of the sequence" (whether it was already elsewhere in
 > the sequence of not), with no need for conditionals regarding previous
> existence, it might appear more "conceptually compact".

But it would not satisfy the concept of "keys of a dictionary" which are 
always unique.

BTW, there are some boundary conditions that should be fulfilled for the 
insertion order, most obviously:

If you define an ordered dict that way:

d = odict()
d['a'] = 1
d['b'] = 2
d['c'] = 3

The keys should then be orderes as ('a', 'b', 'c').

> Anyway -- subclassing dict to implement your definition is reasonably
> easy, and we could put the resulting package on the Cheese Shop.  I hope
> python.org keeps good enough statistics to be able to tell us, a couple
> months later, how many people downloaded said package, vs how many
> people downloaded a complete Python distro; of course, that ratio is
> biased (in favour of the package) by the fact that many people already
> have a complete distro available, while initially nobody would have the
> package, but if we measure when things settle, after letting a month of
> two or 'transient' pass, that effect might be lessened.

That would be also biased (in favour of Python) by the fact that 
probably very little people would look for and use the package in the 
cheese shop if they were looking for ordered dicts. I for example would 
probably use ordered dicts if they were part of the standard lib, but 
not if I have to install it as an obscure separate package. So I don't 
think that will give us a real clue how many people would like ordered 
dicts in the standard lib.

But anyway, if I find some time, I will research a little bit more about 
the issue and create such a package, because it seems to me that the 
existing packages and recipes are not really satisfying and you're right 
it seems to be reasonably easy. It's on my todo list now...

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Bengt Richter schrieb:
> Ok, so if not in the standard library, what is the problem? Can't find what
> you want with google and PyPI etc.? Or haven't really settled on what your
> _requirements_ are? That seems to be the primary problem people who complain
> with "why no sprollificator mode?" questions.

What I don't understand is why legitimate questions such as "why are 
there no ordered dictionaries" are immediately interpreted as 
*complaints* and not just as questions. If I ask such a question, I am 
not complaining but trying to simply figure out *why* there is no such 
thing. Probably there are reasons and all I want to know is find these 
reasons and learn a little bit more about Python in doing so.

Why can't such questions be discussed in a factual, calm and friendly way?

 > They don't know what they really mean when it comes down to a DYFR
 > (Define Your Felicitous Requirements) challenge.

I don't think that this was true in this case, and even if this is the 
outcome, those who asked the question will have learned something.

I think a discussion group is not there for only presenting mature, 
sophisticated thoughts and concepts, but also for "thinking loud" 
together with other about these issues. We all know that clarifying our 
thoughts works often best if you discuss them with others. And I think 
that's one purpose of discussion lists. Asking questions should not be 
immediately be discouraged, even silly questions. If it is really a FAQ, 
you can simply point to the FAQ or add the answer in the FAQ list if it 
is missing there.

-- Chris
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Bengt Richter wrote:

 > d = OrderedDict(); d[1]='one'; d[2]='two' =>> list(d) => [1, 2]
 > ok, now we do d[1]='ein' and what is the order? list(d) => [2, 1] ??
 > Or do replacements not count as "insertions"?

If you simply set a value for a key that already exists, the order 
should not be changed. I think this is intuitive.

>  Or maybe you want to permit append and NOT prevent
 > [('a',1), ('a':2)] and maybe d['a'] => [1, 2] ???

You could ask the same question about dict. I think that is not an 
option. Why should you want odict behave different than dict?

I still believe that the concept of an "ordered dictionary" ("behave 
like dict, only keep the order of the keys") is intuitive and doesn't 
give you so much scope for ambiguity. But probably I need to work on an 
implementation to become more clear about possible hidden subtleties.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Python in Optimized mode with /usr/bin/env

Paulo Eduardo Neves schrieb:
> I want to run an optimized python using the portable /usr/bin/env, but
> the obvious ways aren't working.

Seems to be a Linux problems others also experienced:
http://blog.ianbicking.org/shebang.html

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Alex Martelli wrote:

> Perl's _arrays_ are a bit like Python _lists_, and ordered; it's the
> _hashes_ that are a bit like Python _dicts_, and unordered. PHP's use
> of "array" for both concepts may indeed be a bit confusing.

Perl's _hashes_ have been also called _associative arrays_ originally.

>>Anyway, it would be interesting to examine this in detail and how this
>>is implemented in other languages.

Ok, I just did a little research an compared support for ordered dicts
in some other languages:

* Perl has hashes (associative arrays) which are not ordered.
Here also people are asking for and implementing "ordered hashes",
e.g. http://perltraining.com.au/tips/2005-06-29.html
http://search.cpan.org/dist/Tie-IxHash/lib/Tie/IxHash.pm
http://search.cpan.org/dist/Tie-InsertOrderHash/InsertOrderHash.pm
http://www.yapc.org/America/previous-years/19100/schedule/author/pinyan.html

* Ruby hashes are not ordered.
Again people are asking for and implementing "ordered hashes".
e.g. http://raa.ruby-lang.org/project/orderedhash/
http://groups.google.com/group/comp.lang.ruby/browse_frm/thread/8ebe8d1c5830c801/6428a870925f23f4

* Smalltalk: Innately has unordered Dictionaries.
People are asking for and implementing "OrderedDictionaries" as well,
e.g. http://exept.eu.org:8080/ClassDoc/classDocOf:,OrderedDictionary

* Lisp has (ordered) "association lists".

* PHP has ordered associative arrays.

* Awk, TCL: Associative arrays are unordered.

* C++ has a Map template in the STL which is ordered (a "Sorted
Associative Container").

* Java: Has "LinkedHashMap" which is ordered.

So ordered dictionaries don't seem to be such an exotic idea.

All implementations I found were pretty unequivocally the same that I
had in mind, using insertion order, appending the latest inserted keys
at the end, not changing the order if an existing key is re-inserted,
and deleting keys together with entries.

I also found a discussion thread like the current where similar
arguments were mentioned for and against ordered dictionaries:

In http://mail.python.org/pipermail/python-dev/2005-March/052041.html,
Nick Coghlan raises the following rhetorical question:

Would the default semantics below really be that suprising?

"An ordered dictionary remembers the order in which keys are first seen
and, when iterated over, returns entries based on that order. This
applies to direct iteration, iteration over values and (key, value)
pairs, to the list-producing methods (i.e. keys(), values() and items())
and to any other operations that involve implicit iteration (e.g.
converting to a string representation). Overwriting an entry replaces
its value, but does not affect its position in the key order. Removing
an entry (using 'del') _does_ remove it from the key order. Accordingly,
if the entry is later recreated, it will then occur last in the key
order. This behaviour is analagous to that of a list constructed using
only list.append() to add items (indeed, the key order can be thought of
as a list constructed in this fashion, with keys appended to the list
when they are first encountered)."

This was also the semantics I immediately had in mind when I thought
about ordered dictionaries, though I hadn't read anything about ordered
dictionaries before and it is the same semantics that I found others
have implemented in other languages.

I can't help but I still find it unambiguous and intuitive enough to
consider it "the one" standard implementation for ordered dictionaries.

Also, in the use cases mentioned (describing database columns, html form
fields, configuration parameters etc.), the dictionary is usually only
created once and then not changed, so different handling of re-insertion
or deletion of keys would not even be relevant in these cases.

-- Christoph
--
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Fuzzyman schrieb:
> Of course ours is ordered *and* orderable ! You can explicitly alter
> the sequence attribute to change the ordering.

What I actually wanted to say is that there may be a confusion between a 
"sorted dictionary" (one where the keys are automatically sorted) and an 
"ordered dictionary" (where the keys are not automatically ordered, but 
have a certain order that is preserved). Those who suggested that the 
"sorted" function would be helpful probably thought of a "sorted 
dictionary" rather than an "ordered dictionary."

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Bengt Richter wrote:
> I'm mostly friendly ;-)

I'm pretty sure you are :-)

-- Chris
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Bengt Richter wrote:
> After finally reading that the odict.py in PyPI by Larosa/Foord was what was 
> desired,
> but slower in use than what Fredrik posted, I decided to see if I could speed 
> up odict.py.
> I now have a version that I think may be generally faster.

Hm, I wouldn't formulate it that way that I want the odict of 
Larosa/Foord, but I want "the one" "obvious" odict for which 
Larosa/Foord have already made one implementatin ;-)

Others are here:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/438823
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/107747
http://pleac.sourceforge.net/pleac_python/hashes.html#AEN250

It is all essentially the same idea I think (though after having a 
closer look I see implementation shortcomings in all of them).

 > I now have a version that I think may be generally faster.

Great. I also wanted to do that. Also, I would like to add some 
functionality to Larosa/Foord's odict, like creating or updating an 
odict from an ordinary dict (keys taken over from the ordinary dict will 
be either in random order or automatically sorted). An ordered 
dictionary should also have methods for sorting (like PHP's ksort()).
This way, you could initialize an ordered dict from an ordinary dict, 
sort it, and from then on never care to call keys().sorted() or 
something when iterating over the dictionary. Probably there are other 
methods from lists that could be taken over to ordered dicts.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

One implementation detail that I think needs further consideration is in 
which way to expose the keys and to mix in list methods for ordered 
dictionaries.

In http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/107747
the keys are exposed via the keys() method which is bad. It should be a 
copy only, like for ordinary dicts (one comment also mentions that).

In Foord/Larosa's odict, the keys are exposed as a public member which 
also seems to be a bad idea ("If you alter the sequence list so that it 
no longer reflects the contents of the dictionary, you have broken your 
OrderedDict").

I think it would be probably the best to hide the keys list from the 
public, but to provide list methods for reordering them (sorting, 
slicing etc.).

For instance:

d1 = OrderedDict( (1, 11), (2, 12), 3, 13) )

d1[1:] ==> OrderedDict( (2, 12), 3, 13) )

d1[0] + d1[2] ==> OrderedDict( (1, 11), (3, 13) )

d1.reverse() ==> OrderedDict( (3, 13), (2, 12), 1, 11) )

d1.insert(1, (4, 14))
 ==> OrderedDict( (1, 11), (4, 14), (2, 12), 3, 13) )

etc.

But no other way to directly manipulate the keys should be provided.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Magnus Lycka schrieb:
>> I still believe that the concept of an "ordered dictionary" ("behave 
>> like dict, only keep the order of the keys") is intuitive and doesn't 
>> give you so much scope for ambiguity. 

> Sure. Others think so too. The problem is that if you and these other
> people actually write down exactly how this unambigous ordered dict
> should behave, you'll end up with a dozen different sets of semantics,
> which can be divided in at least three main groups.

That's the point where I dare to disagree. As I have pointed out in 
another posting in this thread, all other implementations have the same 
semantics for the basic behavior. I cannot see three different groups.

Again, what's so surprising as the "natural" semantics described here:
http://mail.python.org/pipermail/python-dev/2005-March/052041.html

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

 >>def __init__(self, init_val = ()):
 >> dict.__init__(self, init_val)
 >> self.sequence = [x[0] for x in init_val]

Fuzzyman wrote:

 > But that doesn't allow you to create an ordered dict from another
 > ordered dict.

Right; I did not want to present a full implementation, just the 
relevant part. Of course, a real implementation should also allow to 
build an ordered dict from another ordered dict or an ordinary dict. (In 
the latter case, maybe the keys should be automatically sorted.) But one 
or two case disctinctions would not make things mentionable slower.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Carsten Haese schrieb:
> I don't think it's intuitive if you can't describe it without
> contradicting yourself. If the order of the keys really were the order
> in which they were first seen by the dictionary, deleting and recreating
> a key should maintain its original position.

Admitted that description was not perfect (anyway it was not mine ;-)).

If a key is deleted, it is deleted. I don't think it is intuitive to 
expect that the dict "remembers" a deleted item. If it is added again, 
it is like it is seen for the first time and thus appended.

I don't think your argument viliates what I said in my last post.

-- Chris
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Carsten Haese wrote:

> That could easily be fixed by making the sequence a "managed property"
> whose setter raises a ValueError if you try to set it to something
> that's not a permutation of what it was.

Ok, a managed attribute would be an option. You would allow people to do 
what they want with the sequence, and then fix the dictionary 
accordingly (if items were deleted from the sequence, they are deleted 
from the dictionary, it items were added which are not in the directory, 
a ValueError is raised etc.).

But probably it is still better to hide the sequence and instead of 
letting people do list operations like sort() on the odict.sequence, let 
them do these operations on odict directly.

>>d1[0] + d1[2] ==> OrderedDict( (1, 11), (3, 13) )
> What do you think you're doing here?

Sorry, what I meant was

d1[0:0] + d1[2:2] ==> OrderedDict( (1, 11), (3, 13) )

Ordered dictionaries could allow slicing and concatenation.

-- Christoph

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

>>I still believe that the concept of an "ordered dictionary" ("behave 
>>like dict, only keep the order of the keys") is intuitive and doesn't 
>>give you so much scope for ambiguity. But probably I need to work on an 
>>implementation to become more clear about possible hidden subtleties.

Bengt Richter schrieb:

> Does that mean that the Larosa/Foord odict.py implementation in PyPI
> does not do what you want?

Basically, it is what I had in mind. But I'm now seeing some things that 
could be improved/supplemented, e.g.
- performance improvements
- the internal keys list should be hidden
- list methods should be mixed in instead

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

> d1[0:0] + d1[2:2] ==> OrderedDict( (1, 11), (3, 13) )

Oops, sorry, that was nonsense again. I meant
d1[0:1] + d1[1:2] ==> OrderedDict( (1, 11), (3, 13) )

> Ordered dictionaries could allow slicing and concatenation.

Some operations such as concatenation need of course special 
considerations, since the keys must stay unique. A concatenation of 
ordered dicts with overlapping keys should probably give an IndexError.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Steve Holden schrieb:
> Perhaps now the answer top your question is more obvious: there is by no 
> means universal agreement on what an "ordered dictionary" should do. 
> Given the ease with which Python allows you to implement your chosen 
> functionality it would be presumptuous of the core developers to favour 
> any one of the several reasonable alternatives that might be chosen.

The discussion showed indeed that there are some points which are not so 
straightforward as I believed. But I still don't see the "several 
reasonable alternatives". So far all implementations I have seen are 
basically pretty similar. Is there any implementation that is basically 
different from Foord/Larosa's odict? I still don't see a principal 
problem here, just the problem that the existing implementations are not 
well enough thought-out and sophisticated enough.

 > Given the ease with which Python allows you to implement your chosen
 > functionality it would be presumptuous of the core developers to
 > favour any one of the several reasonable alternatives that might be
 > chosen.

You could say the same about the idea of sets in Python, and it is 
probably the reason why it took so much time until they became a part of 
Python. I wished that had happened earlier, since sets are "the" basic 
mathematic structure. I'm now very happy to have sets not only in the 
standard lib but even as built-in types and I'm sure there hasn't been 
"universal agreement" on how sets should be implemented either. I don't 
think it was presumptuous to finally settle down on one implementation.

Of course, ordered dictionaries are no candidates for becoming built-in 
data types, and the existing implementations are not sophisticated and 
mature enough to go to the standard lib right now. But principally, if 
an improved version is developed (maybe on the occasion of this thread) 
and will be tested and proven, why should it not go to the standard lib 
some day?

BTW, somebody has suggested it could go to "collections" which sounds 
like the right place, but the description says the module is intended 
for "High-performance container datatypes", and, as has been discussed, 
ordered dictionaries are not used for performance or efficiency reasons, 
but rather for reasons of convenience. So *if* they ever go to the 
standard lib, I'm not sure whether "collections" would be the right 
place. Or collections will need a different description - maybe there 
are other interesting basic collection types which are chosen for 
convenience, not for performance (for instance, ordered sets)?

-- Christoph

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

[EMAIL PROTECTED] schrieb:
> It seems to be though as "ordered dictionary" are slowly to be confined
> to only "ordered on order of change to the dictionary".

"Ordered dictionary" means that the keys are not an ordinary set like in 
an ordinary dictionary, but an "ordered set." I think this definition is 
pretty straightforward and common. An ordered set is the same as a 
unique list, i.e. a list where all elements are unique.

When there is automatical ordering using a comparison function, I would 
not speak of an "ordered directory", but of a "sorted directory." This 
would be basically a dictionary plus a comparison function. The keys 
would always be iterated in the order given by the comparison function. 
It would be nice to have a sorted dictionary as well, even if you can 
achieve the same by calling the sorted built-in on the keys. Think of a 
sorted directory as a subclass of ordered directories. Implementing it 
that way would even have perfomance benefits if the keys are inserted 
using the bisect algorithm. This is better than calling sorted() on the 
keys of an ordinary dictionary many times.

By the way, you will find the same terminology in Smalltalk, where 
"SortedCollection" is a subclass of "OrderedCollection".

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Bengt Richter wrote:
 > I think the concept has converged to a replace-or-append-by-key ordering
 > of key:value items with methods approximately like a dict. We're now
 > into usability aspects such as syntactic sugar vs essential primitives,
 > and default behaviour vs selectable modes, ISTM.

Yes, and we would be good if we do not stop the discussion at this point 
with nothing, but really create such a sophisticated implementation. 
Whether it will become popular or go to the standard lib some day is a 
completely different question.

 > E.g., it might be nice to have a mode that assumes d[key] is 
d.items()[k][1] when
 > key is an integer, and otherwise uses dict lookup, for cases where 
the use
 > case is just string dict keys.

I also thought about that and I think PHP has that feature, but it's 
probably better to withstand the temptation to do that. It could lead to 
an awful confusion if the keys are integers.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

>>* C++ has a Map template in the STL which is ordered (a "Sorted 
>>Associative Container").

Alex Martelli wrote:

> Ordered *by comparisons on keys*, NOT by order of insertion -- an
> utterly and completely different idea.

Shame on me. I talked so much about the difference between "ordered" and 
"sorted" and now I wrote such a confusing sentence. You're right, C++ 
Maps are not an example for "ordered dictionaries", but for "sorted 
dictionaries".

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Duncan Booth schrieb:
> In Javascript Object properties (often used as an associative array) are 
> defined as unordered although as IE seems to always store them in the order 
> of original insertion it wouldn't surprise me if there are a lot of 
> websites depending on that behaviour.

You're right with both. The ECMA language definition says object 
properties are an unordered collection, but MSIE and probably other 
browsers keep the order in which they were created. Of course one should 
not rely on that.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Alex Martelli wrote:

> However, since Christoph himself just misclassified C++'s std::map as
> "ordered" (it would be "sorted" in this new terminology he's now
> introducing), it seems obvious that the terminological confusion is
> rife.  Many requests and offers in the past for "ordered dictionaries"
> (e.g. on this group) were also "sorted", NOT "ordered", in this new
> terminology.

I'm sorry again. Please don't conclude that others are as uneducated as 
I am (I haven't studied computer science but mathematics). Speaking 
about "ordered" and "sorted" in the context of collections is not a new 
terminology I am introducing, but seems to be pretty common in computer 
science (see, e.g. 
http://www.gamespp.com/java/dataStructuresInJavaPart6.html).

Perhaps Pythonists are not used to that terminology, since they use the 
term "list" for an "ordered collection". An ordered dictionary is a 
dictionary whose keys are a (unique) list. Sometimes it is also called a 
"sequence" (therefore in the odict implementation, the keys attribute 
has this name). A unique list, in turn, can also be called an "ordered 
set", so you can also understand an ordered dictionary as a dictionary 
whose keys are an ordered set. I think all of this is pretty logical ;-)

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Ganesan Rajagopal wrote:

> the definition of "sorted" and "ordered", before we can > go on ? Sorted 
> would be ordered by key comparison. Iterating over such a container will 
> give you the keys in sorted order. Java calls this a SortedMap. See 
> http://java.sun.com/j2se/1.4.2/docs/api/java/util/SortedMap.html C++ STL 
> map container is also a Sorted Associative container. See 
> http://www.sgi.com/tech/stl/Map.html  Ganesan

Exactly, that's "sorted." "Ordered" means the same there is some order 
between the existing elements, but there is no magic (i.e. a general 
comparison function) for ordering new elements. Thus, if you add an 
element to an ordered collection, it simply gets appended (is considered 
as the greatest of all elements) by convention, whereas if you add an 
element to a sorted collection, it will be inserted into the correct 
place by using the comparison function.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Alex Martelli schrieb:

> I detest and abhor almost-sequences which can't be sliced (I consider
> that a defect of collections.deque).  If the ordered dictionary records
> by its sequencing the time order of key insertion, being able to ask for
> "the last 5 keys entered" or "the first 3 keys entered" seem to me to be
> perfectly natural use cases, and most naturally accomplished by slicing
> of course, d[-5:] and d[:3] respectively.

I agree. A use case was requested: Say your dictionary holds form 
fields, and you know the last field is always a hidden field you wont to 
ignore in your output, or your dictionary holds attributes of a 
database, and you don't want to print the first attribute which is 
always some kind of uninteresting OID, then you would write
"for field in d[1:]" or "for field in d[:-1]".

(Otherwise, you would have to write "for field in d.keys()[1:]" etc.)

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Bengt Richter wrote:

>  >>> from odictb import OrderedDict
>  >>> d1 = OrderedDict([(1, 11), (2, 12), (3, 13)])
>  >>> d1
>  {1: 11, 2: 12, 3: 13}
>  >>> d1[1:]
>  {2: 12, 3: 13}
>  >>> d1[0:1] + d1[2:3]
>  {1: 11, 3: 13}
>  >>> d1.reverse()
>  >>> d1
>  {3: 13, 2: 12, 1: 11}
>  >>> d1.insert(1, (4,14))
>  >>> d1
>  {3: 13, 4: 14, 2: 12, 1: 11}
>  >>> d1.items()
>  [(3, 13), (4, 14), (2, 12), (1, 11)]
>  >>> d1.keys()
>  [3, 4, 2, 1]
>  >>> d1.values()
>  [13, 14, 12, 11]
>  >>> d1[1:2]
>  {4: 14}
>  >>> d1[-1:]
>  {1: 11}
> 
> Que mas?

Eso es exactamente lo que yo queria haber!

-- Chris
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

>> I think it would be probably the best to hide the keys list from the 
>> public, but to provide list methods for reordering them (sorting, 
>> slicing etc.).

Tom Anderson wrote:

> I'm not too keen on this - there is conceptually a list here, even if 
> it's one with unusual constraints, so there should be a list i can 
> manipulate in code, and which should of course be bound by those 
> constraints.

Think of it similar as the case of an ordinary dictionary: There is 
conceptually a set here (the set of keys), but you cannot manipulate it 
directly, but only through the according dictionary methods.

For an ordedred dictionary, there is conceptually a list (or more 
specifically a unique list). Again you should not manipulate it 
directly, but only through methods of the ordered dictionary.

This sounds at first more complicated, but is in reality more easy.

For instance, if I want to put the last two keys of an ordered dict d at 
the beginning, I would do it as d = d[:-2] + d[-2:].

With the list attribute (called "sequence" in odict, you would have to 
write: d.sequence = d.sequence[:-2] + d.sequence[-2:]. This is not only 
longer to write down, but you also have to know that the name of the 
attribute is "sequence". Python's strength is that you don't have to 
keep many details in mind because it has a small "basic vocabulary" and 
orthogonal use. I prefer the ordered dictionary does not introduce new 
concepts or attributes if everything can be done intuitively with the 
existing Python methods and operators.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Fuzzyman wrote:

> So what do you want returned when you ask for d1[1] ? The member keyed
> by 1, or the item in position 1 ?

In case of conflict, the ordered dictionary should behave like a 
dictionary, not like a list. So d1[1] should be the member keyed by 1, 
not the item in position 1. Only in case there is no member keyed by 1, 
the item in position 1 could be returned, but I think that would be too 
adventurous a hattrick and can lead to big confusion. Better to raise a 
KeyError in that case.

>> But no other way to directly manipulate the keys should be provided.

> Why - don't trust yourself with it ?

No, because I think it is not needed if list operations like insert are 
directly possible on your dictionary.

But maybe methods such as setkeys() and setvalues() would be nice to 
have in addition.

Instead of writing d.sequence = new_sequence, I would write 
d.setkeys(new_sequence). But I'm not sure what to do if new_sequence is 
not a permutation of the old one. Raise a KeyError? Or even swallow 
this? For instance

d = OrderedDict((1,11), (2,12))

d.setkeys((2, 1)) ==> d = OrderedDict((2, 11), (1, 12))

d.setkeys((3, 4)) ==> d = OrderedDict((3, 11), (4, 12)) (or KeyError?)

d.setvalues((12, 11)) ==> d = OrderedDict((1, 12), (2, 11))

d.setvalues((13, 14)) ==> d = OrderedDict((1, 13), (2, 14)) (always ok)

(Or maybe better set_keys in analogy to has_key?)

I hesitate making keys and values managed properties, because this would 
conflict with them being methods in ordinary dicts. Ordered dicts should 
resemble ordinary dicts as closely as possible. And giving it a 
different name like "sequence" I find confusing and unintuitive.

A resort could be the following: If keys() is given a sequence as 
argument, then use this as the new sequence of keys, and similar with 
values(). This way, no new methods need to be introduced.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Fuzzyman wrote:

>>- the internal keys list should be hidden

> I disagree. It is exposed so that you can manually change the order
> (e.g. to create a "sorted" dict, rather than one ordered by key
> insertion). What do you *gain* by hiding it ?

See my other posting. Of course hiding the list can only be done, if

>>- list methods be mixed in instead

In this case, you can change the order directly by using the list 
methods on the dictionary. Sorting would be an example. Instead of

d.sequence = sorted(d.sequence)

you could simply write d.sort() which does the same.

> Hmm... I did consider this. Which list methods would you consider
> appropriate ?

Everything method that does not clash with the use as dictionary. For 
instance, both lists and dicts have __getitem__ and __setitem__, so im 
this case, the dictionary method must take precedence. But a dictionary 
has not __getslice__ and __setslice__, so here the list methods can be 
used (__getslice__ is actually deprecated, but you get the idea). In 
some cases, like __delitem__, both have it, but there is no clash.

Other interesting methods are sort() and reverse().

Here, we have another problem however: There is not only the list of 
keys, but also the list of values, and sometimes, as in the case of 
sort() and reverse(), it would be also nice to have it operate on the 
list of values. How should this be done? PHP solves it by using two 
different methods ksort and asort for keys and values. In this notation:

d = OrderedDict( (1,13), (3,12), (2,11) )

d.ksort() ==> d = ( (1,13), (2,11), (3,12) )
d.asort() ==> d = ( (2,11), (3,12), (1,13) )

Similar for reverse().

If the keys() and values() methods would be extended to be setters, then 
d.ksort() = d.keys(sorted(d.keys())) and
d.asort() = d.values(sorted(d.values()))

Anyway, I don't like "ksort" and "asort". If it must be, I'd rather use

d.ksort() = d.sortkeys()
d.asort() = d.sortvalues()

d.sort() could default to one of them (not sure which one).

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Fuzzyman wrote:

> That's not the only use case. Other use cases are to have a specific
> order, not based on entry time.
> 
> Simple example :
> d1 = OrderedDict(some_dict.items())
> d1.sequence.sort()
> d1 is now an ordered dict with the keys in alphabetic order.

As I said, I would not need to access the sequence, if I can write
d1.sort() or d1.sortkeys()

> If you don't want to modify sequence, don't. If you want a copy do :
> seq = d1.sequence[:]

This is not needed since you can do the same with: seq = d1.keys()

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Carsten Haese schrieb:

> Thus quoth the Zen of Python:
> "Explicit is better than implicit."
> "In the face of ambiguity, refuse the temptation to guess."
> 
> With those in mind, since an odict behaves mostly like a dictionary, []
> should always refer to keys. An odict implementation that wants to allow
> access by numeric index should provide explicitly named methods for that
> purpose.

Exactly. But I don't think in this case such methods would be needed. 
You easily get the i-th value in the ordered dict as d.values()[i].

-- Chris


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Accessing a database from a multithreaded application

Alan Kemp schrieb:
> I have a problem that is half python, half design.  I have a
> multithreaded network server working, each client request spawns a new
> thread which deals with that client for as long as it is connected
> (think ftp style rather than http style connections here).  Each thread
> gets passed a reference to the main server to access things like the
> list of connected clients, global data, etc.
 > ...
> Can someone suggest a better (ie, valid) strategy for this?

Have a look at DBUtils (http://www.webwareforpython.org/DBUtils).

Basically, there are two possibilities: Persistent connections that are 
bound to your server threads, or a pool of connections that are 
independent from the server threads. I prefer the first solution if the 
number of server threads stays constant. If the server regularly creates 
and destroys threads, I prefer spooling. DBUtils supports both.

I plan to write a doco describing these ideas and the usage of DBUtils 
in detail. For now you need to get along with the inline docstrings.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Why is dictionary.keys() a list and not a set?

Ok, the answer is easy: For historical reasons - built-in sets exist 
only since Python 2.4.

Anyway, I was thinking about whether it would be possible and desirable 
to change the old behavior in future Python versions and let dict.keys() 
and dict.values() both return sets instead of lists.

If d is a dict, code like:

for x in d.keys():
 ...

or even

for x in sorted(d.keys()):
 ...

would still work and do the same.

However, the older idiom

k = d.keys()
k.sort()
for x in k:
 ...

would break (since you cannot sort a map directly).

So it seems not possible to change the behavior since it would break old 
code. Maybe keys() and values() could be made deprecated functions, 
superseded by keyset() and valueset(), but probably this would not be 
worth the effort.

Another related question: Actually, dicts can be considered as 
subclasses of sets and thus could inherit a lot of set methods and 
operators. Only intersection and union must be treated with care, 
because dictionaries must be mappings (i.e. map to exactly one value).

In fact, some inheritance from sets has already been implemented:

For instance, by allowing the set operator "in" for dictionaries, 
instead of "has_key".

The len(), clear(), copy() and update() functions can also be 
interpreted as inherited from set. The dictionary method popitem() is 
actually the inherited set method pop(). (d.pop() without arguments 
could actually do the same as d.popitem(); I guess the suffix "item" was 
chosen to remind you that it returns a key/value pair, not only a value.)

But could other set methods also be useful?

A dictionary could inherit the remove() and discard() method from sets. 
d.remove(x) would do the same as del d[x], but d.discard(x) would not 
raise a KeyError if x not in d.

Or, for instance, if

a = { 1:11, 2:12 }; b = { 2:22, 3:13 }, c = { 2:32 }

then

c.issubset(b)  ==  c <= b  == True
b.issuperset(c)  ==  b >= c  == True
a.difference(b)  ==  a - b  == { 1:11 }
a.s.symmetric_difference(b) ==  a ^ b  ==  { 1:11, 3:13 }
a.update(b)  ==  a |= b  ==  a = { 1:11, 2:22, 3:13 }
a.intersection_update(b)  ==  a &= b  ==   a = { 2:22 }
a.difference_update(b) ==  a -= b  ==  a = { 1:11 }
a.symmetric_difference_update(b) ==  a ^= b  == a = { 1:11, 3:13 }

Of these, a |= b may be particularly interesting as short notation for 
a.update(b).

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Here is another old posting I just found which again gives the same use 
cases and semantics:

http://mail.python.org/pipermail/python-dev/2005-March/051974.html

"Keys are iterated over in the order that they are added. Setting a
value using a key that compares equal to one already in the dict
replaces the value, but not the key, and does not change the iteration
order. Removing a key (and value) then re-adding it moves the key to the
end of the iteration order."

So far all those who would like to have an ordered dict seem to have the 
same semantics in mind.
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Duncan Booth schrieb:
> On IE this will go through elements in the order 0, 1, 2, 4, 3.

Oops! I bet most people would not expect that, and it is probably not 
mentioned in most Javascript tutorials. I think this is a weakpoint of 
the ECMA definition, not MSIE.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Fuzzyman schrieb:

> I'm going to add some of the sequence methods. I'm *not* going to allow
> indexing, but I will allow slicing.
> 
> You can also do d[d.keys()[i]]
> 
> This provides two ways of fetching values by index, so I don't want to
> add another.

And this would be probably faster than d.values()[i] in the usual 
implementation where values() has to be built first as Carsten noted.

-- Chris
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

Mike Meyer wrote:
> Are you sure dict.values() should be a set? After all, values aren't
> guaranteed to be unique, so dict(a = 1, b = 1).values() currently
> returns [1, 1], but would return set([1]) under your proposal.

Good point. Contrary to the keys, the values are not unique. Still, it 
would make sense to only return the set (the value set of the mapping) 
because this is usually what you are interested in and you'd think the 
information about which value belongs to which key is lost anyway.

However (see other postings in this thread) this last statement is 
wrong: If you call keys() and values() consecutively, then they are 
guaranteed to correspond to each other. If values() would return a set, 
this would be completely broken.

> What about dict.items()?

Actually, if keys() would be a set, this should return a set, too (the 
set of key/value tuples).

>>For instance, by allowing the set operator "in" for dictionaries,
>>instead of "has_key".
> "in" already works for dicdtionaries:

I know. I wanted to use it as an example where set operations have 
already been made available for dictionaries.

I know, 'in' has existed for lists and tuples long before sets, but 
actually it is a native set operation.

 From a mathematical point of view, it would have been nice if Python 
had defined "set" as a basic data type in the beginning, with lists, 
tuples, dictionaries as derived data types.

By the way, i wonder why the common mathematical notation { 1,2,3 } was 
not allowed for set((1,2,3)). It would not clash with the dictionary 
notation which requires additional colons.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

Martin v. Löwis schrieb:

> You answer the question whether it would be possible (no, because of
> backwards compatibility); you don't attempt to answer the question
> whether it would be desirable. Why do you think that would be
> a good idea?

It was meant simply as an open question to be discussed here, I did not 
say that I think it would be a good idea. Sometimes it's good to 
question why certain things are set up the way they are.

I considered the question not completely absurd, because if a language 
supports sets and dictionaries, it would be somewhat consequential that 
the keys of a dictionary are a set. (We also discussed "ordered 
dictionaries" here, where it is obvious that the keys are a list.) At 
least from a mathematical/aesthetical view.

If you consider compatibility and usability aspects, making them sets 
would probably not be a good idea.

> If you want the set of keys of a dictionary d, then do set(d):
>  >>> d={1:2,3:4}
>  >>> set(d)
> set([1, 3])

I know. But this does not answer the question: If the keys *are* already 
a set according to their semantics, why aren't they returned as a set 
from the beginning?

Better answers:

- Compatibility issues, particularly the keys()/values() match hattrick
- Because lists are still more native/pythonic objects than sets

> As Mike Meyer explains, the same is meaningless for .values():
> they might not be unique. For .items(), conversion into a set
> does would appear to be meaningful, but doesn't actually work:
> if the values contain unhashable objects, the set of tuples
> cannot be constructed (as set members must be immutable).

I would not say that a values() set would be meaningless; it is an 
interesting object in itself. It would however lose the information 
about the "frequency" of the values.

But your second objection is a valid one. So I can add to the list of 
reasons why sets are not used for values() and items():

- Because sets can only contain immutable values

This, of course, in turn raises the question ;-) Would it be desirable 
to have an additional, more general set datatype that can contain 
mutable objects? I know about the perfomance drawbacks. And I think I 
know the answer: It would not make much sense, since the implementation 
would be actually not different from a list. So you could take a list 
anyway. Which gives the answer why values() and items() return a list.

Probably the only use of such a general set type would be that it could 
be considered as an abstract supertype for list and value. And in cases 
where the order does not matter, this could be made more clear by using 
sets. (Similar as the reason why you use False and True when you could 
use 0 and 1 instead.)

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

> Martin v. Löwis wrote:

>> If you want the set of keys of a dictionary d, then do set(d):
>>  >>> d={1:2,3:4}
>>  >>> set(d)
>> set([1, 3])
> 
> I know. But this does not answer the question: If the keys *are* already 
> a set according to their semantics, why aren't they returned as a set 
> from the beginning?

Sorry. Your answer was good; I missed the point and thought you wrote 
set(d.keys()). Is it documented anywhere that set(d) = set(d.keys())? I 
think this should go into the Python Doco where keys() are explained.

I would have expected that set(d) returns set(d.items()), but as has 
been discussed, this would cause problems with mutable values.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

[EMAIL PROTECTED] schrieb:
> You can already get a set from a dictionary's keys in an efficient manner:
l = dict.fromkeys(range(10))
set(l)
> Set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Good point. I expected that set(l) = set(l.items()) and not 
set(l.keys()), but the latter would not work with mutable values. See 
discussion with Martin.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Fuzzyman wrote:

> You will be able to mutate the the keys list through :
> 
> d1 = OrderedDict(some_sequence_of_items)
> keys = d1.keys()
> keys.sort() # or other mutation
> d1.keys(keys)
> 
> Admittedly this is a lot slower than :
> 
> d1 = OrderedDict(some_sequence_of_items)
> d1.sequence.sort()
> 
> *but* it frees the squence attribute from any implementation details.

You should also implement

d1.sort() or d1.sortkeys()

which will have no performance drawbacks.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Bengt Richter schrieb:

>>d.setvalues((13, 14)) ==> d = OrderedDict((1, 13), (2, 14))

> The implication above is that OrderedDict takes an *args argument,
> but really it takes a single argument that is a sequence of k,v pairs,
> (and maybe some keyword options).

Right. Interpret it as a short notation only, I did not want to change 
the interface. I think it's a common mistake to forget the brackets 
because you think one pair should be enough. At least I often forget it.

> You could make keys, values, and items custom descriptors which would define 
> __call__
> for the old key() etc accesses, well as __getitem__ and __setitem__. Then you 
> could write
> 
> d.items[0], d.items[-1] = d.items[-1], d.items[0]
> 
> to swap the order of the first and last items in the thing (I hesitate to say 
> dict ;-)
> You could also operate on slices.

Nice. You could also get the i-th value with d.values[i].

But is this considered good style or would it be considered "dirty" to 
have a callable member that also supports indexing and slicing? (I don't 
know, just asking?) Plus, it opens a can of worms by increasing the 
complexity tremendously. Let's see whether this can be handled.

> BTW, before I showed an example where d[2:3] returned
> a new dict instance rather than d.items()[:]. I think the items list
 > is better, since they naturally add, sort, reverse, etc as lists,
 > and you can write OrderedDict(d[2:3]+d[1:2]) if you want a new dict.

Not sure about that. I would rather expect that if you slice an object, 
you get an object of the same type. And you can also add, sort, reverse, 
etc the ordered dict if you want.

> A slice assignment to d[i:j] is really sugar for something, but we have to 
> decide exactly
> what in case of duplicate keys in the incoming items, either with each other 
> ...

You mean slice assignments to d.items[i:j]. If you make the slice 
assignment to d[i:j] then at least you cannot have conflicts in the 
incoming items. The first question is: Should a slice assignment be 
treated as deletion with subsequential addition (changing the order) or 
as a replacement (i.e. try to not change the order)? I agree that the 
second would be the expected semantics, but as you mentioned more 
difficult to implement.

 > One way to define it would be
 > d.items[i:j] = itemseq
> 
> to be implemented as sugar for
> newitems = d.items[:i] + list(itemseq) + d.items[j:]
> d.clear()
> d.update(newitems)

Which should be the same as
d = OrderedDict(d.items[:i] + list(itemseq) + d.items[j:])
Sounds reasonable.

> So
> d.reverse()
> could be spelled
> d.items[:] = d.items[::-1]

Slice assignment for keys and values would also allow to set a new key 
order with

d.keys[:] = newkeyseq

So no need to define a setkeys() method or let keys() take arguments.
If we want to allow this kind of things at all.

> If you are using slices, you can safely use them directly in the __getitem__ 
> of th dict interface,
> as I did in the "Que mas?" post. So we don't have to write d.items[i:j] and 
> could write d[i:j].
> The thing is, d[i:j] will tempt to skip ".items" in d.items[i] and write 
> d[i], which has the dict
 > key meaning, not the item list index. It is faster not have a 
descriptor between though.

I still think d[i:j] should return an ordered dict, not an item list.

> I think maybe allowing write to keys or values is pretty iffy. There are too 
> many weird
> re-associations of keys and values possible, and which you could do my other 
> means if you
> really really wanted to. But I do think full index and slice read access 
> would be fine.

There are different opinions. Fuzzyman would probably say "Don't trust 
yourself?" I myself am undecided. Perhaps you could expect that somebody 
knows what he does if he makes a slice assignment to d.keys. In any way, 
such an assignment should not "break" the directory (with "broken" I 
mean that the internal keys sequence does not match the keys of the 
internal dictionary). If anything does not match, it should raise a 
KeyException. If it is implemented that way, I think we can allow it.

> I think also that d1==d2 should effectively be implemented as d1[:] == d2[:] 
> -- i.e, compare
> the item lists to implement comparisons.

Wouldn't it be more performant to compare for 
d1.internal_dict==d2.internal_dict and 
d1.internal_sequence==d2.internal_sequence?
You don't keep track of the item lists, they need to be built on every 
occasion.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Fuzzyman schrieb:
> d.keys() will still return a copy of the list, so d.keys()[i] will
> still be slower than d.sequence[i]

Right, I forgot that.  Bengt suggested to implement __call__ as well as 
__getitem__ and __setitem__ for keys, values and items.

In this case, you could very effectively access it as d.values[i].

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

>>- Because sets can only contain immutable values

Mike Meyer wrote:

> Not true. Sets can only contain *hashable* objects, which isn't the
> same thing.

I trusted the doco which defines a set as "an unordered collection of 
immutable values" (http://docs.python.org/lib/types-set.html).

Anyway, the problems stays the same.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

Alex Martelli wrote:

> Absolutely not in my use cases.  The typical reason I'm asking for
> .values() is because each value is a count (e.g. of how many time the
> key has occurred) and I want the total, so the typical idiom is
> sum(d.values()) -- getting a result set would be an utter disaster, and
> should I ever want it, set(d.values()) is absolutely trivial to code.

Agree. This reason alone is enough for values() to be a list, besides 
the problem that it can contain unhashable values.

> Note that in both cases I don't need a LIST, just an ITERATOR, so
> itervalues would do just as well (and probably faster) except it looks
> more cluttered and by a tiny margin less readable -- "the sum of the
> values" rolls off the tongue, "the sum of the itervalues" doesn't!-)
> So, the direction of change for Python 3000, when backwards
> compatibility can be broken, is to return iterators rather than lists.
> At that time, should you ever want a list, you'll say so explicitly, as
> you do now when you want a set, i.e. list(d.values())

Sounds reasonable.

> In Python today 'in' doesn't necessarily mean set membership, but some
> fuzzier notion of "presence in container"; e..g., you can code
> 
> 'zap' in 'bazapper'
> 
> and get the result True (while 'paz' in 'bazapper' would be False, even
> though, if you thought of the strings as SETS rather than SEQUENCES of
> characters, that would be absurd).  So far, strings (plain and unicode)
> are the only containers that read 'in' this way (presence of subsequence
> rather than of single item), but one example should be enough to show
> that "set membership" isn't exactly what the 'in' operator means.

In this case, I would interpret the set on which 'in' operates as the 
set of all substrings of the given string to have peace for my soul ;-)

> You appear to have a strange notion of "derived data type".  In what
> sense, for example, would a list BE-A set?  It breaks all kind of class
> invariants, e.g. "number of items is identical to number of distinct
> items", commutativity of addition, etc..

Sorry. Your're right. In the computer world, sets are derived from lists 
(collections). In mathematics, lists are derived from sets. Here, you 
would define the list [1, 1] as the set {1, {1, 1}} (you see, the 
cardinality is the same). Yes, mathematics is strange ;-) Actually, in 
mathematics, everything is a set and set theory is the "theory of 
everything". When I grew up pedagogues here in Germany even believed it 
would be best if kids learn set theory and draw venn diagrams before 
they learn arithmetics... We were tortured with that kind of things in 
the first class. Probably I'm still suffering from the late damages of 
that treatment.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

Mike Meyer schrieb:

> If the hash changes, you've screwed up the set. What it really should
> say is "collection of objects with fixed hashes", or words to that
> effect. Do you want to file a PR on this?

I fear I have not enough understanding of Python's hashing concepts to 
make a suggestion for improvement here.

> How so? As I demonstrated, you can subclass any class that doesn't
> have a hash to add one, and then use the subclass, which - except for
> having a hash - will have exactly the same behavior as your original
> class.

But you surely wouldn't want to do this to the list of items just to be 
able to return it as a set.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

Alex Martelli wrote:

> An alternative theory, of course, is "God made the natural numbers; all
> else is the work of man" -- and that one is by a German, too (Kronecker,
> if I recall correctly).

Yes, it was Kronecker. But even natural numbers are usually constructed 
with sets using Peano's axioms.

 > The hope to found all of mathematics on set theory was primarily a
 > _British_ effort, as I see it (Russell and Whitehead), and failed a
 > long time ago... I'm not sure what, if anything, a mathematician of
 > today would propose as the foundational theory...

Perhaps "string theory" ;-) So probably strings should become the 
fundamental datatype in Python (immutable strings of course) :-)

> But OO really requires a different mindset, particularly when operating
> under a regime of "mutable" objects.  "A circle IS-AN ellipse" in
> Euclidean geometry... but inheriting Circle from Ellipse doesn't work in
> OO if the objects are changeable, since you can, e.g., change
> eccentricity in an Ellipse but not in a Circle...

Good example.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

Mike Meyer wrote:

 > Well, I just made a suggestion. You found the problem - you want to do
 > the PR, or should I?

Please go ahead if you have the time. By the way, the doco may be also 
inexact about the keys of dictionaries.

> # Untested code
> class Hash(object):
>def __new__(cls, obj):
>   if hasattr(obj, '__hash__'):
>  return obj
>   self.__obj = obj
>   return object.__new__()
>def __getattr__(self, name):
>return getattr(self.__obj, name)
>def __setattr(self, name, value):
>setattr(self.__obj, name, value)
>def __hash__(self):
>return id(self.__obj)

This will not work since even lists seem to have a __hash__ attribute. 
Also, you will get an infinite recursion when trying to access 
self.__obj. But principally, it should work. Ad hoc solution:

class HashProxy:
 def __init__(self, obj):
 self.__obj = obj
 def __getattr__(self, name):
 return getattr(self.__obj, name)
 def __setattr(self, name, value):
 setattr(self.__obj, name, value)
 def __hash__(self):
 return id(self.__obj)

def Hash(obj):
 try:
 hash(obj)
 except TypeError:
 return HashProxy(obj)
 else:
 return obj

 > If you need to turn a list of arbitrary, possibly unhashable, objects
 > into a set, is there a problem with the above?

Well, first you don't get the real unhashable objects, but proxied 
objects. This will lead to problems if you want to check the type in the 
following - you will always get the same type. Second, as you have 
already mentioned, the elements of the sets will not be guaranteed to be 
different anymore (only different in terms of identity (is), but they 
still may be equal (==)). This would not be what you expect from a set.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

Martin v. Löwis schrieb:

 > Your original question was "could it be changed, and should it be
 > changed?" As the answer to the first part is already "no", the
 > second part really doesn't raise.

Right of course. The second part of the question was rather hypothetical 
(in the sense of "if we could start with Python from scratch, would it 
then be desirable").

 > In a set, all elements are different. In Python, this means that, for
 > any two elements X and Y of a set, X!=Y. Now, consider this set:
 > a = [1]
 > b = [1,2]
 > S = set(a,b)
 > a.append(2)
 > Now, a==b, so the inherent set property breaks. In theory, this should
 > cause S to contain only a single element; the other one should
 > disappear.

I just started to see these problems as well. I believed that the 
immutability of set elements had only technical reasons (hashing etc.) 
but your're right, it has also semantical reasons. In the above case, a 
or b would have to magically disappear, and even if that would be 
possible it would be completely unclear which of the two, and what would 
happen if you subsequently do a.append(3)? Should it magically appear 
again? So I understand you will get into very hot water if you try to 
implement sets with mutable elements.

 > This is not only hard to implement (removal from a set
 > would have to remove all duplicates, iterating over a set would have
 > to skip over duplicates) - there is also another semantical issue:
 > If one element is skipped/dropped, which of these (a or b)?

I should have read your posting fully before writing the above ;-)

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

Paul Rubin schrieb:

> Yes, it's not considered terribly troublesome.  Set theory (and
> arithmetic) are generally accepted as being consistent but incomplete.
> So there will always be something for mathemeticians to do ;-).

Somehow I found this being comforting. Even in the realm of Mathematics 
there are things which can only be believed, but not be proven. Just as 
in Physics there are things which can only be predicted with a certain 
probability, but they are not predetermined completely.

But now we're really getting off topic.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

Martin v. Löwis wrote:

> It follows from what is documented. set() creates a
> set that contains all elements in the iterable object:
> http://docs.python.org/lib/built-in-funcs.html#l2h-63
> Now, is a dictionary an iterable object? Yes, it is:
> http://www.python.org/peps/pep-0234.html
> Together, this gives the property I demonstrated.

You need to know a third thing, namely that as an iterable object, a 
dictionary returns the keys only, not the key/value tuples which would 
also make some sense. If it had been designed that way, you could write

for k, v in d:
print k, v

instead of:

for k in d:
print k, d[k]

What I wanted to say is that the doco could mention this possibility to 
get the keys as a set at the place where it explains the keys() method.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Le ruego me perdone.

replace('haber', random.choice('tener', 'hacer', 'lograr'))

Mi espanol es peor que mi python.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

[EMAIL PROTECTED] schrieb:

> And this, again from the doc(about mapping objects):
> A mapping object maps immutable values to arbitrary objects.
> Seems that is questionable too.
> a=(1,[])
> d={}
> d[a]=1
> again would give TypeError, list object are unhashable.

That's a good example showing that the term 'mutable' is not so 
well-defined as it may seem.

If you set b=[]; a=(1,b); should a be considered mutable (because you 
can change its value by changing b), or should it be considered 
immutable (because it is a tuple)?

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

As a general note, I think it would be good to place the exact 
description in a footnote, since speaking about hashable objects, 
__hash__ and __cmp__ will certainly frighten off newbies and make it 
hard to read even for experienced users. The main text may lie/simplyfy 
a little bit.

Or as Donald Knuth once recommended:

"Simplify. Lie, if it helps. You can add the correct details later on, 
but it is essential to present the reader with something straightforward 
to start off with. So don’t be afraid to bend the facts initially where 
this leads to a useful simplification and then pay back the debt to 
truth later by gradual elaborations."

However, since the Python docs are a reference and not a textbook or 
manual, I think the debt should not be payed back "later", but 
immediately in a footnote.

The docs already follow this principle very well, e.g.:
http://docs.python.org/lib/typesmapping.html

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Fuzzyman wrote:

> That means making keys, values, and items custom objects.
> Creating a new instance would have the overhead of creating 4 new
> objects instead of just 1. Is the added convenience worth it ? (Plus
> the extra layers of method calls for each access).

I'm not sure about that either. But since you are using odict for 
convenience reasons anyway, and not for performance reasons, it would be 
consequential. Performance testing should be made anyway, so this could 
be tested as well. I think that creating these 3 additional objects will 
not matter much if the dict has more than a handful of items. And the 
extra layers will be only called if you really access these keys, values 
or items attributes which will not happen very frequently. Normally, you 
just loop over an ordered directory and acess keyed values.

> I'm not sure. It's a nice idea in terms of using it (we could just
> leave the sequence attribute as an alias for the new keys attribute -
> for backwards compatibility).

Yes, you could make it a deprecated feature.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

Mike Meyer schrieb:

> Ok, how about this for dictionaries/sets:
> 
> Any hashable object can be used as a dictionary key (set member). Immutable
> objects, except for tuples that contain a non-hashable object, are
> hashable. Python classes are normally hashable(1).

I would be even more simple here, like that:

The keys can be arbitrary immutable objects. Thus lists, sets or 
dictionaries are not allowed as keys. Tuples are allowed if they do not 
directly or indirectly contain mutable objects. More exactly, the keys 
are required to be hashable (1).

And then go on and define "hashable" in the footnot.

I think that is easy and understandable and covers the basic cases.

> And the footnote is:
> 
> Instances of Python classes are hashable if they define a __hash__
> method, or if they don't define either __cmp__ or __eq__.

I think that's not exact enough. For instance, ordinary lists also 
define a __hash__ method but are not hashable since that method just 
raises an exception. Also, as Martin pointed out, if both is there 
(__hash__ and __cmp__/__eq__) you would also require of a "proper" 
__hash__ method that equal objects hash the same. Otherwise, semantics 
would suffer, you could have dicts with non-unique keys (i.e. keys which 
are only distinct as objects, but not different as values).

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Tom Anderson schrieb:

> Maybe we should call it a 'sequenced dictionary' to fit better with 
> pythonic terminology?

That's not such a bad idea. Note that it is called like that in the 
Python version of the "Programming Language Examples Alike Cookbook":

http://pleac.sourceforge.net/pleac_python/hashes.html#AEN250

Anyway, I still favor the more common term "ordered dictionary".

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

It seems everybody is in full agreement here.

I have the same mixed feeling about letting keys/values/items become 
both managed list attributes and still returning copies of the lists 
when called in the usual way as methods.

I don't know any precedent for doing things that way and i'm unsure 
whether it meets the Zen of Python. But anyway the idea is nice enough 
to try it out.

-- Chris
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Tom Anderson wrote:

> True, but that's not exactly rocket science. I think the rules governing 
> when your [] acts like a dict [] and when it acts like a list [] are 
> vastly more complex than the name of one attribute.

I think it's not really rocket science either to assume that an ordered 
dictionary behaves like a dictionary if you access items by subscription 
and like a list if you use slices (since slice indexes must evaluate to 
integers anyway, they can only be used as indexes, not as keys).

>> Python's strength is that you don't have to keep many details in mind 
>> because it has a small "basic vocabulary" and orthogonal use.

> No it isn't - it's in having a wide set of basic building blocks which 
> do one simple thing well, and thus which are easy to use, but which can 
> be composed to do more complex things.  What are other examples of this
 > kind of 'orthogonal use'?

I mean for instance that you can deal with a tuple in the same way as 
with a string and things like that. Concerning "small": Somebody coined 
the good slogan "Python fits my brain" but I could also say "Python fits 
*into* my brain" (I mean without the batteries of course ;-) - you 
pretty soon have all the built-in functins, datatypes and their use in 
your head. On the other side of course, Python is a huge field and you 
can discuss endlessly as we are doing here.

Anyway, my argument was not good (avoiding new attributes names in order 
to keep the vocabulary small).

And by the way, what both of us listed as strengths of Python will be 
probably claimed by protagonists of any other language as well.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

Mike Meyer wrote:

 > The mutability - or immutability - of an object is irrelevant to
 > the question of whether or not they can be a dictionary key/set
 > element. The critical property is that they are hashable.
 > *Most* immutable builtin types are hashable, and no builtin
 > mutable types are hashable, which deserves a mention.

I think the problem we are facing is that you must distinguish between 
hashable/mutable *types* and hashable/mutable objects. You can have 
*types* like tuple which are hashable/immutable themselves, but can have 
*instances* that are mutable and not hashable, such as ([]).

For the built-in types I think objects are hashable if and only if they 
are immutable (not 100% sure, but can you give a counterexample?). 
That's why I got back to talk about mutability first. But I agree one 
should also explain the problem of non-built-in types.

Maybe one can split the explanation and talk about the built-in 
datatypes first and then explain the situation for user-defined types.

 >>>And the footnote is:
 >>>Instances of Python classes are hashable if they define a __hash__
 >>>method, or if they don't define either __cmp__ or __eq__.
 >>
 >>I think that's not exact enough. For instance, ordinary lists also
 >>define a __hash__ method but are not hashable since that method just
 >>raises an exception.
 >
 > Ordinary lists aren't Python classes.

In your first sentence, it was unclear whether "they" refered to the 
class or the instance. But anyway, what about the following class:

class mylist(list):
 pass

This class definitely has a __hash__ method. But instances of this class 
are not hashable.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

2005-11-26 Thread Christoph Zwerschke

Mike Meyer wrote:

> Ok, how about this for dictionaries/sets:
> 
> Any hashable object can be used as a dictionary key (set member). Immutable
> objects, except for tuples that contain a non-hashable object, are
> hashable. Python classes are normally hashable(1).

The problem with the last sentence is that can be misunderstood (classes 
themselves as objects are always hashable) and has little information 
otherwise (what is "normal"?).

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

2005-11-26 Thread Christoph Zwerschke

Mike Meyer wrote:

 > I think you're trying to tweak the wrong definition. Types are
 > immutable if their instances are immutable.

I'm not trying to tweak it, but to gain more clarity about what is 
actually meant when you talk about "mutable" types and objects and 
whether it suffices to use these terms as long as you speak about 
built-in datatypes.

Tuples have been considered an immutable type since generations, becaue 
the elements of a tuple cannot be changed (i.e. the elements themselves, 
not their values). Likewise, they should be considered a hashable type 
because they have a proper hash function that can be used if all 
elements are hashable. Still, you can create instances of this 
immutable/hashable data type which are mutable/not hashable. So in the 
docs it will be important to be clear and emphasize that the *objects* 
have to be immutable/hashable, not only their types.

 > So whether or not a tuple containing a mutable object is
 > immutable depends on whether you define a immutable as "the object
 > can't change" or "the objects value can't change".

Right - if you have an actual tuple instance like (a,) and a is a list, 
you can argue whether it should be considered mutable or not. But you 
cannot argue whether it is hashable or not. It's simply not hashable. In 
so far, speaking about whether an object is hashable is more precise 
(well-defined) than asking whether it is mutable, you're right.

 > Do you think it's really necessary to specify "has a __hash__ method
 > that returns a value", as opposed to just "has a __hash__ method"?

I think it is necessary to require that it has a "proper" __hash__ 
function. As far as I understand you can always add a __hash__ method. 
Not only invalid __hash__ methods like in this case:

class mylist0(list):
 def __hash__(self): return None

But you can also easily add valid __hash__ methods:

class mylist1(list):
 def __hash__(self): return 0815

class mylist2(list):
 def __hash__(self): return id(self)

In the case of mylist1, everything is ok including semantics, but 
performance suffers dramatically. In mylist2, performance is great, but 
semantics suffers greatly.

Which of these user-defined types would you call "hashable"? Technically 
speaking, both probably are, so you can indeed use list objects of both 
types as keys of a dictionary - but I think there should be a hint in 
the docs that you need to have a "proper" hash function if you want your 
dictionary to be performant and show the usually expected behavior.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

2005-11-26 Thread Christoph Zwerschke

>>> class mylist1(list):
>>> def __hash__(self): return 0815
>>>
>>> class mylist2(list):
>>> def __hash__(self): return id(self)
>>>
>>> Which of these user-defined types would you call "hashable"?

 > Mike Meyer wrote:

>> The latter two, as the given __hash__ meets the specifications for
>> __hash__.

Martin v. Löwis wrote:

> This is not true. The second definition of __hash__ does not meet
> the specifications:
> "The only required property is that objects which compare equal have the
> same hash value"

As Mike has written in his last posting, you could easily "fix" that by 
tweaking the equality relation as well. So technically speaking, Mike is 
probably right. It would completely break common-sense semantics, 
because for mylist2, if I have a=[1] and b=[1] then I will have a!=b. 
This would not be very reasonable, but probably allowed by Python.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

Martin v. Löwis schrieb:

>> As Mike has written in his last posting, you could easily "fix" that 
>> by tweaking the equality relation as well. So technically speaking, 
>> Mike is probably right.
> 
> No. If you define both __hash__ and __eq__ consistently, then __hash__
> would meet the specification. As posted in the example, __hash__ does
> not meet the specification, contrary to Mike's claim that it does.

Ok, the example was incomplete, but in principle, this could be fixed by 
adding a tweaked __eq__ method. So for completeness sake:

class mylist2(list):
 def __hash__(self): return id(self)
 def __eq__(self, other): return self is other
 def __ne__(self, other): return self is not other

list = mylist2

a = list([1])
b = list([1])

print a, b, a == b, a != b

> If you have a=[1] and b=[1], then *always* a==b; you cannot
> change the semantics of list displays.

Surely, since you can't change the methods of built-in types. But you 
can create your own local list displays with a tweaked semantics as in 
the above example.

Anyway, the original question was: Are mylist1 and mylist2 (as above) to 
be considered "hashable" types or not? I think "technically" speaking, 
they are, and the "technical" definition is the only one that counts.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Bengt Richter wrote:

>>d.keys[:] = newkeyseq
> 
> Do you really mean just re-ordering the keys without a corresponding reording 
> of values??
> That would be a weird renaming of all values. Or do you means that any key 
> should still
> retrieve the same value as before if used as d[key]? In which case the values 
> must undergo
> the same permutation as the keys. I.e., you are assuming key->value pairings 
> remain stable
> through any key reorderings?

Since it is considered as being a dictionary in the first place, the 
key->value pairings should of course stay stable. In the usual 
implementation based on an ordinary dictionary with an additional key 
list ("sequence" in the Foord/Larosa and "_keys" in the Bejamin/Winter 
implementation), you would only set the key list, since the value list 
is generated dynamically. But if your implementation keeps internal 
values or items lists, these need to be adjusted as well.

I will assume that d has is a Foord/Larosa ordered dict with "sequence" 
attribute in the following.

Then, with other words,

d.keys[:] = newkeyseq

should do the same as:

d.sequence = newkeyseq

> Exactly what, though? should e.g.
> d.keys[3] =  newk3
> mean (not a suggested implementation, just to define semantics)
> keys = d.keys()
> if newk3 in keys and keys.index(newk3)!=3:
> raise ValueError,'Attempt to introduce duplicate key'
> items = d.items()
> items[3] = (newk3, items[3][1])
> d.clear()
> d.update(items)

Yes, that would be the correct semantics. Of course this should not be 
the real implementation and use KeyError instead of ValueError. With 
other words,

d.keys[i] = newkey

sould be the same as:

if d.sequence[i] != newkey:
 if newkey in d.sequence:
  raise KeyError,'Attempt to introduce duplicate key'
 else:
  d.sequence[i] = newkey

> This would allow what you might call renaming in place.
> Similarly
> d.keys[i:j] = newkeysij
> might have the semantics of
> keys = d.keys()
> outside = set(keys[:i])+set(keys[j:])
> if outside & set(newkeysij) or len(newkeysij) != len(set(newkeysij)):
> raise ValueError,'Attempt to introduce duplicate key(s)'
> items = d.items()
> items[i:j] = [(k, items[kx+i][1]) for kx,k in enumerate(newkeysij)]
> d.clear()
> d.update(items)
> 
> Is this what is desired?

Not quite, because it does not preserve the key->value pairings (see 
above) and it would behave strangely or raise an exception if the new 
slice is larger. The following code would do:

keys = d.keys()
outside = set(keys[:i])|set(keys[j:])
if outside & set(newkeysij) or len(newkeysij) != len(set(newkeysij)):
 raise ValueError,'Attempt to introduce duplicate key(s)'
items = d.items()
items[i:j] = [(k, d.get(k, None)) for k in newkeysij]
d.clear()
d.update(items)

(Note that there was a bug in the second line. You cannot add sets.)

Again, this would be equivalent to:

seq = d.sequence
newseq = seq[:]
newseq[i:j] = newkeysij
if len(newseq) != len(set(newseq)):
 raise KeyError,'Attempt to introduce duplicate key(s)'
for k in set(seq[i:j]) - set(newkeysij):
 del d[k]
for k in set(newkeysij) - set(seq[i:j]):
 d[k] = None
d.sequence = newseq

>>You don't keep track of the item lists, they need to be built on every 
>>occasion.
> 
> That depends on how you implement ;-)

Ok, I was thinking of the usual implementations.

> Back from holiday, so maybe I'll hack something out.

Let us know when you have something to check out.

Maybe Fuzzyman can make some moderate improvements to the existing 
odict.py, and you can do something more "radical". Then we have two 
"reference implementations" and can compare how they prove regarding 
performance and usability.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Christoph Zwerschke wrote:

> I will assume that d has is a Foord/Larosa ordered dict with "sequence" 
> attribute in the following.
> 
> Then, with other words,
> 
> d.keys[:] = newkeyseq
> 
> should do the same as:
> 
> d.sequence = newkeyseq

At least in the case where newkeyseq is a permutation of d.sequence.

Otherwise, it should behave like the given implementation for setting 
slices, i.e.

- if newkeyseq has duplicate elements, an exception should be raised.
- if newkeyseq has elements not in d.sequence, then the dictionary 
should be updated with corresponding None values
- if d.sequence has elements not in newkeyseq then these elements should 
be deleted from the dictionary

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

Bengt Richter schrieb:

> OTOH,
>  >>> {}[:]
>  Traceback (most recent call last):
>File "", line 1, in ?
>  TypeError: unhashable type
> I.e., slices are not valid keys for ordinary dicts, and slices tie in
> very well with the ordered aspect of ordered dicts, so that's an
> argument for permitting it via the indexing syntax, not just items[:]
> or items()[:] which have related but not identical semantics.

I see it like that. BTW, the above error message is pretty bad.

 > I wonder who is going to use it for what.

I think re-ordering will be a very rare use case anyway and slicing even 
more. As a use case, I think of something like mixing different 
configuration files and default configuration parameters, while trying 
to keep a certain order of parameters and parameters blocks.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why is dictionary.keys() a list and not a set?

Martin v. Löwis wrote:

> I long forgot was the original question was (I thought it was
> "Why is dictionary.keys() a list and not a set?" :-)

Er, yes. Probably we should have opened a new thread instead: 
"Improvement of the std lib doc concerning keys of sets/dicts".

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

General question about Python design goals

Sometimes I find myself stumbling over Python issues which have to do 
with what I perceive as a lack of orthogonality.

For instance, I just wanted to use the index() method on a tuple which 
does not work. It only works on lists and strings, for no obvious 
reason. Why not on all sequence types?

Or, another example, the index() method has start and end parameters for 
lists and strings. The count() method also has start and end parameters 
for strings. But it has no such parameters for lists. Why?

However when I ask such things I noticed I get answers like: "Is there a 
use case?" "You can do it some other way so it is not worth bothering."

Let me ask back: Do I really need to bother and justify it with a use 
case in a case where the language can be easily made more consistent or 
orthogonal without breaking anything?

What about design goals such as:

- orthogonality
- coherence, consistency
- principle of least astonishment ("Python fits my brain")
- simplicity ("kiss" principle)
- aesthetics, symmetry

Actually, which priority have the above design goals for Python? Are 
other design goals considered more important?

If I compare them with the "Zen of Python", I find some of the above:

consistency -> Special cases aren't special enough to break the rules
simplicity -> Simple is better than complex
aesthetics -> Beautiful is better than ugly

Actually, concerning the last two, you already need to understand the 
Zen of Python to decide if something is "simple" or even "beautiful", so 
they are not really suitable to *define* the Zen of Python. For me, a 
programming language is beautiful if it is orthogonal and coherent. But 
for others, this may be different. Somehow, I'm missing a direct 
allusion to the following in the Zen of Python:

- orthogonality
- principle of least astonishment

Maybe I am I lacking the satori of a real Python Zen master?

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: General question about Python design goals

>>Let me ask back: Do I really need to bother and justify it with a use 
>>case in a case where the language can be easily made more consistent or 
>>orthogonal without breaking anything?

Robert Kern wrote:

> Yes. If it's not going to be used, then there's not much point.
> Practicality beats purity, and all that.

I have nothing against "practicality beats purity". But to stay with the 
examples I have given, in how far is writing list(t).index(x) more 
practical than t.index(x)?

And by the way, if we are speaking about practicality here, are we 
speaking about practicality for the Python developers or for the Python 
users? This may be sometimes be in opposition.

Let me give an extreme example: Assume the count() method for strings 
would have been called "numberofsubstringsinthestring()" and I argue it 
should be called "count()" because that is shorter and how it is called 
for lists. What should I give as a "use case" here? Must I give "use 
cases" if the issue is about convenience/simplicity/elegance?

> However, I will note that if you were to present us with a working patch
> with documentation and unittests, then you'll probably get responses
> along the lines of "Thank you!", instead.

Not everybody has the time and skills to provide that. Ordinary people 
from the Python user base should be given the opportunity to make 
suggestions for improvement - of course not in the tone of "I 
demand...", but they should not be automatically regarded as "demanding" 
if they just utter their opinion or make a suggestion either.

Even if I had the time and skills, before starting to work on a patch, 
unittests etc. I still would first discuss with others whether my 
suggestion is reasonable and would be appreciated by the "user base", 
developers and the BDFL. Just take the example of the following patch:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=403693&group_id=5470

It was not rejected by reason of missing unittests or documentation, but 
the very idea was rejected.

Actually, I'm not keen on being a protagonist for this issue or starting 
a hard-bitten discussion about this and defend the idea if everybody is 
against or nobody cares. It does not bother me that much either.

But it just led me to the general question: Which significance actually 
have design features such as orthogonality for Python?

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: General question about Python design goals

>>For instance, I just wanted to use the index() method on a tuple which 
>>does not work. ...

Aahz wrote:

> Because Guido believes that tuples should be primarily used as
> lightweight replacements for C structs.  Therefore they have minimal
> functionality.

But the problem is that the tutorials and manuals give the impression 
that the difference between lists and tuples is only mutablity versus 
immutability. They don't talk about such considerations and honestly 
speaking even now I know that it does seem more plausible for me.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: New docs for set elements/dictionary keys

Mike Meyer wrote:

> Any object for which hash() returns an appropriate value(1) can be
> used as a dictionary key/set element. Lists, sets and dicts are not
> hashable, and can not be used. Tuples can be used if all the things
> they contain are hashable. instances of all other builin types can be
> used. Instances of most classes written in Python can be used(2).
> 
> 1) See the __hash__ documentation for details on what an approriate
> value is.
> 
> 2) Instances that have a __hash__ method that returns an appropriate
> value can be used. Instances that don't have a __cmp__ or an __eq__
> method can be used even if they don't have a __hash__ method.

I think that is not so bad. How about this simplification:

Any hashable object(1) can be used as a dictionary key/set element. 
Lists, sets and dicts are not hashable, and can not be used. Tuples can 
be used if all the things they contain are hashable. Instances of all 
other built-in types and most user-defined classes are hashable.

(1) Objects for which the hash() function returns an appropriate 
(proper?) value. See the __hash__ documentation for details.

-- Christoph

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: General question about Python design goals

Fredrik Lundh schrieb:
> Christoph Zwerschke wrote:
>>But the problem is that the tutorials and manuals give the impression
>>that the difference between lists and tuples is only mutablity versus
>>immutability.
>
> both the tutorial and the FAQ discusses the difference in terms of use
> cases and recommended usage.

The tutorial does not speak about differences other than mutability. But 
you're right, the C structs analogy is mentioned in the FAQ.

>>Maybe I am I lacking the satori of a real Python Zen master?
> 
> I suggest you look up the phrase "bike shed effect".  next, go read some
> recent PEP:s to see what's really going on in the Python design universe.

The bike shed effect is a good explanation and so true. About 8 years 
ago when I started to work at my current working place at the university 
I suggested that a bike shed should be provided for people like me. 
Since then, a lot of million euro projects have been carried out, like 
introducing SAP software and new projects are in progress. But my bike 
still is getting wet and anyway, it's still bothering me.

-- Christoph

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: General question about Python design goals

Aahz wrote:
>Christoph wrote:
>>Aahz wrote:
>>>Christoph wrote:
For instance, I just wanted to use the index() method on a tuple which 
does not work. ...
>>>
>>>Because Guido believes that tuples should be primarily used as
>>>lightweight replacements for C structs.  Therefore they have minimal
>>>functionality.
>>
>>But the problem is that the tutorials and manuals give the impression 
>>that the difference between lists and tuples is only mutablity versus 
>>immutability. They don't talk about such considerations and honestly 
>>speaking even now I know that it does seem more plausible for me.
> 
> Then feel free to submit patches for the docs.

Why should I patch the docs to add things that don't seem plausible for 
me? I'd rather change the behavior to be more consistent instead 
tweaking the docs to somehow justify the inconsistent behavior.

> PS: If you want further responses from me, please follow standard Usenet
> quoting conventions (like those above).

I am tempted to answer, "A Foolish Consistency is the Hobgoblin of 
Little Minds" ;-)

No honestly, it was not in bad faith. I'm just not a regular usenet user 
and believed one attribution line per post should suffice. But reading 
the conventions I agree that they make sense.

And I think it's a good analogy. Some people dislike Python's stringent 
whitespace syntax and "one obvious way" rule because it does not give 
them freedom to write programs in their individual style. But I don't 
think it's "foolish consistency" - it simply makes programs of others 
more easy to read, you can concentrate on the content and not the style. 
It's the same reason why people use quoting conventions. And BTW, I 
think that makes particularly suited for open source and XP projects.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: General question about Python design goals

Fredrik Lundh schrieb:
> Christoph Zwerschke wrote:
> 
>>What about design goals such as:
>>
>>- orthogonality
>>- coherence, consistency
>>- principle of least astonishment ("Python fits my brain")
>>- simplicity ("kiss" principle)
>>- aesthetics, symmetry
>>
>>Actually, which priority have the above design goals for Python? Are
>>other design goals considered more important?
> - A Foolish Consistency is the Hobgoblin of Little Minds
> - Hypergeneralization Sucks

Ok, these are nice aphorisms with some truth. But I had to think of the 
German excuse "Wer Ordnung hält ist nur zu Faul zum Suchen - ein Genie 
überblickt das Chaos." ("Those who try to keep things tidy are just too 
lazy to search for searching - a genius surveys the chaos").

They remind you that consistency - as every good goal - can be 
overemphasized and exaggerated on the cost of other good goals, but 
otherwise they have little wisdom in the realm of programming.

In the original context (http://www.emersoncentral.com/selfreliance.htm) 
and IIUC, the meaning of the "foolish consistency" aphorism is:

"A great mind will not stupidly think/work consistently from one day to 
the next, but is able to throw away beloved ideas, beliefs, habits, 
styles etc. if he/she finds something that is better, more true or more 
appropriate under changed circumstances."

Try to transfer this to programming languages: "A great programming 
language does not have to function consistently from one version to the 
next, you can completely throw away a well-established syntax if you 
find something better." I think this would not really be desirable.

Another problem I have with that aphorisms applied to Python is that it 
has some undertone of "Python is only intended to be used by great 
minds" and making it usable and understandable for "little minds" is no 
design goal at all. That would be a very arrogant and elitarian view. 
Little minds like me should be able to say "Python fits my brain" just 
as great minds. Even kids can understand Python (there's a Germany book 
"Python for Kids", http://www.way2python.de/pythonbuch/kids.html) and I 
think i's not disqualifying Python as a "toy language", but I consider 
it rather a strength of Python. Ok, there is even a book "C++ for kids", 
but I think in reality it is intended for adult beginners...

Most of all, such solgans do not explain *when* consistency is to be 
considered foolish and when it is wise. Are those who suggest changes in 
order to make Python more consistent with itself foolish or are those 
foolish who reject this, saying that "Python has always been like that, 
nobody has cared in the past and we want to stay compatible". This could 
be also considered a "foolish consistency".

A programming language is not a "work of art". If you are an artist, you 
may break symmetry and introduce all kinds of unexpected effects. 
Actually, as an artist, you purposfully want to provoke astonishment. 
But if I am using a programming language or a user interface, I don't 
want to be confronted with inconsistent behavior. Here, the "principle 
of least astonishment" is much more helpful (in my little mind's humble 
optionion).

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: New docs for set elements/dictionary keys

Mike Meyer wrote:
> Christoph Zwerschke wrote:
>>I think that is not so bad. How about this simplification:
>>
>>Any hashable object(1) can be used as a dictionary key/set
>>element. Lists, sets and dicts are not hashable, and can not be
>>used. Tuples can be used if all the things they contain are
>>hashable. Instances of all other built-in types and most user-defined
>>classes are hashable.
>>
>>(1) Objects for which the hash() function returns an appropriate
>>(proper?) value. See the __hash__ documentation for details.
> 
> I think that will do quite nicely, as the information about __hash__,
> __cmp__ and __eq__ are all in the __hash__ documentation. You wrote it
> - why don't you submit a PR with it?

Actually you wrote it ;-) Do suggestions like this one go to the normal 
Python sf bug tracker?

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: New Ordered Dictionery to Criticise

Fuzzyman wrote:
> Sorry for this hurried message - I've done a new implementation of out
> ordered dict. This comes out of the discussion on this newsgroup (see
> blog entry for link to archive of discussion).

Thanks. I'll try to check it out and put my oar in over the next 
weekend. One thing I already noticed: str() and repr() both output the 
OrderedDict as if it was an ordinary dictionary. For str() this may be 
acceptable, but repr() should output "a valid Python expression that 
could be used to recreate an object with the same value".

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: General question about Python design goals

Peter Hansen schrieb:
> Christoph Zwerschke wrote:
>
>> Ok, these are nice aphorisms with some truth. But I had to think of 
>> the German excuse "Wer Ordnung hält ist nur zu Faul zum Suchen - ein 
>> Genie überblickt das Chaos." ("Those who try to keep things tidy are 
>> just too lazy to search for searching - a genius surveys the chaos").
> 
> 
> Is that really the translation?  "too lazy to search for searching"?
> What does that mean?

Sorry. Cancel either "to search" or "for searching". I noticed I made a 
lot of typos in that posting. In the German sentence, there should be 
comma after "ist", and "faul" should be written with a lowercase "f". 
Probably this caused additional confusion for Google...

-- Christoph

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: General question about Python design goals

Christoph Zwerschke wrote:
> there should be comma after "ist",

Sorry, *before* "ist". I should probably stop posting for today...

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: General question about Python design goals

2005-11-29 Thread Christoph Zwerschke

Mike Meyer wrote:
> Christoph Zwerschke wrote:
>>A programming language is not a "work of art". If you are an artist,
>>you may break symmetry and introduce all kinds of unexpected
>>effects. Actually, as an artist, you purposfully want to provoke
>>astonishment. But if I am using a programming language or a user
>>interface, I don't want to be confronted with inconsistent
>>behavior. Here, the "principle of least astonishment" is much more
>>helpful (in my little mind's humble optionion).
> 
> But a programming language (or UI) is not just a collection of syntax
> and and interfaces - it's an implementation. You need to keep in mind
> that "practicality beats purity". If following POLA makes the
> implementation an order of magnitude slower or larger, then you don't
> follow POLA - at least until you can do it without that cost.

I do not deny this. As an example, if tuples can be used as keys of 
dicts, you might first naively assume that lists could also work. This 
is not the case for practicality (performance) reasons and because it 
would be difficult or impossible to guarantee the uniqueness of keys. In 
this case, POLA is overruled by other goals or inherent necessities.

But that does not mean that POLA in itself is not an important guiding 
principle for programming languages (or UI).

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: General question about Python design goals

2005-11-29 Thread Christoph Zwerschke

Fredrik Lundh wrote:
> on the other hand, it's also possible that there are perfectly usable ways
> to keep bikes and bike seats dry where Christoph works, but that he prefers
> not to use them because they're violating some design rule.

Depends on how you understand "perfectly usable." My collegue always 
carries his expensive racing bike to our office in the 3rd floor out of 
fear it may get wet or stolen. But I think this is not very convenient 
and I want to avoid discussions with our boss about skid marks on the 
carpet and things like that. Probably that collegue would not complain 
as well if he had to cast tuples to lists for counting items - you see, 
people are different ;-)

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Why are there no ordered dictionaries?

2005-11-29 Thread Christoph Zwerschke

I had the same idea to create a py.test to verify and compare various 
implementations. The doctests in odict.py are nice, but you can't use 
them for this purpose and they may not test enough. It would be also 
good to have something for testing and comparing performance.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: General question about Python design goals

2005-11-30 Thread Christoph Zwerschke

[EMAIL PROTECTED] wrote:
> Paul Rubin wrote:
>>Look at the list.count() example at the start of this thread.
>>Diagnosing it isn't hard.  Curing it isn't hard.  It doesn't bloat
>>Python by an order of magnitude.  A suitably factored implementation
>>might handle lists and strings with the exact same code and not incur
>>any extra cost at all.  That type of thing happens all the time here.
> 
> I believe the language creator use the "lack of" as a way to
> prevent/discourage that kind of usage. Just like the ternary
> operator(still don't know why it is finally accepted). It is not a
> problem(not having), it is a feature(to teach you program better), so
> what cure are we talking about ?

Sorry, but I still do not get it. Why is it a feature if I cannot count 
or find items in tuples? Why is it bad program style if I do this? So 
far I haven't got any reasonable explanation and I think there is no.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: General question about Python design goals

Donn Cave schrieb:
> As I'm sure everyone still reading has already heard, the natural usage
> of a tuple is as a heterogenous sequence.  I would like to explain this
> using the concept of an "application type", by which I mean the set of
> values that would be valid when applied to a particular context.  For
> example, os.spawnv() takes as one of its arguments a list of command
> arguments, time.mktime() takes a tuple of time values.  A homogeneous
> sequence is one where  a  and  a[x:y]  (where x:y is not 0:-1)  have
> the same application type.  A list of command arguments is clearly
> homogeneous in this sense - any sequence of strings is a valid input,
> so any slice of this sequence must also be valid.  (Valid in the type
> sense, obviously the value and thus the result must change.)  A tuple
> of time values, though, must have exactly 9 elements, so it's heterogeneous
> in this sense, even though all the values are integer.

I understand what you want to say, but I would not use the terms 
"homogenuous" or "heterogenous" since their more obvious meaning is that 
all elements of a collection have the same type.

What you are calling an "application type" is a range of values, and the 
characteristic you are describing is that the range of values is not 
left when you slice (or extend) an object. So what you are describing is 
simply a slicable/extendable application type. It is obvious that you 
would use lists for this purpose, and not tuples, I completely agree 
with you here. But this is just a consequence of the immutability of 
tuples which is their more fundamental characteristic.

Let me give an example: Take all nxn matrices as your application type. 
That applicaiton type is clearly not slicable/extendable, because this 
would change the dimension, thus not "heterogenous" in your definition.

So would you use tuples (of tuples) or lists (of lists) here? Usually 
you will use lists, because you want to be able to operate on the 
matrices and transform them in place. So you see the more fundamental 
characteristic and reason for prefering lists over tuples is mutability.

Let us assume you want to calculate the mathematical rank of such a 
matrix. You would bring it in upper echelon shape (here, you are 
operating on the rows, thus you would use lists) and then you would 
count the all-zero rows. Ok, this is not an example for using count() on 
tuples, but it is an example for using count() on a "heterogenous" 
collection in your definition.

I completely agree that you will need count() and item() much less 
frequently on tuples because of their immutability. This is obvious. 
(Tuples themselves are already used less frequently than lists for this 
reason.) But I still cannot see why you would *never* use it or why it 
would be bad style.

And I don't understand why those who smile at my insistence on design 
principles of consistence - propagating practicability instead - are 
insisting themselves on some very philosophical and non-obvious design 
principles or characteristics of tuples.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: General question about Python design goals

I think this all boils down to the following:

* In their most frequent use case where tuples are used as lightweight 
data structures keeping together heterogenous values (values with 
different types or meanings), index() and count() do not make much sense.

I completely agree that his is the most frequent case. Still there are 
cases where tuples are used to keep homogenous values together (for 
instance, RGB values, points in space, rows of a matrix). In these cases 
it would be principally useful to have index() and count() methods.

But:

* Very frequently you will use only 2- or 3-tuples, where direct queries 
may be faster than item() and count(). (That's probably why Antoon's RGB 
example was rejected as use case though it was principally a good one).

* Very frequently you want to perform operations on these objects and 
change their elements, so you would use lists instead of tuples anyway. 
See my use case where you would determine whether a vector is zero by 
count()ing its zero entries or the rank of a matrix by count()ing zero rows.

* You will use item() and count() in situations where you are dealing 
with a small discrete range of values in your collection. Often you will 
use strings instead of tuples in these cases, if you don't need to sum() 
the items, for instance.

So, indeed, very few use cases will remain if you filter throught the 
above. But this does not mean that they do not exist. And "special cases 
aren't special enough to break the rules." It should be easy to imagine 
use cases now.

Take for example, a chess game. You are storing the pieces in a 
64-tuple, where every piece has an integer value corresponding to its 
value in the game (white positive, black negative). You can approximate 
the value of a position by building the sum(). You want to use the tuple 
as a key for a dictionary of stored board constellations (e.g. an 
opening dictionary), therefore you don't use a list.

Now you want to find the field where the king is standing. Very easy 
with the index() method. Or you want to find the number of pawns on the 
board. Here you could use the count() method.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: General question about Python design goals

Chris Mellon schrieb:
> First, remember that while your idea is obvious and practical and
> straightforward to you, everybodys crummy idea is like that to them.
> And while I'm not saying your idea is crummy, bear in mind that not
> everyone is sharing your viewpoint.

That's completely ok. What I wanted to know is *why* people do not share 
my viewpoint and whether I can understand their reasons. Often, there 
are good reasons that I do understand after some discussion. In this 
case, there were some arguments but they were not convincing for me.

I think the rest of your arguments have been already discussed in this 
thread. People seem to have different opinions here.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: General question about Python design goals

Fredrik Lundh wrote:
> Christoph Zwerschke wrote:
> now, I'm no expert on data structures for chess games, but I find it hard to
> believe that any chess game implementer would use a data structure that re-
> quires linear searches for everything...

Using linear arrays to represent chess boards is pretty common in 
computer chess. Often, the array is made larger than 64 elements to make 
sure moves do not go off the board but hit unbeatable pseudo pieces 
standing around the borders. But in principle, linear arrays of that 
kind are used, and for good reasons.

I already feared that a discussion about the details and efficiency of 
this implementation would follow. But this was not the point here.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: General question about Python design goals

Steve Holden wrote:
> Christoph Zwerschke wrote:
>> I completely agree that his is the most frequent case. Still there are 
>> cases where tuples are used to keep homogenous values together (for 
>> instance, RGB values, points in space, rows of a matrix). In these 
>> cases it would be principally useful to have index() and count() methods.
>>
> Why? Why does it make sense to ask whether an RGB color has a particular 
> value for one of red, green or blue? Why does it make sense to ask how 
> many elements there are in an RGB color? It doesn't, so you must be 
> talking about (ordered) *collections* of such items.
 >
> If you want a list of RGB colors then use a list. If you want a list of 
> points in space then use a list. Why is a tuple preferable? [If the 
> answer is "because a tuple can't be changed" go to the bottom of the 
> class].

I cannot follow you here. How would you store RGB values? I think they 
are a perfect use case for tuples in the spirit of Guido's "lightweight 
C structs." So, in the spirit of Guido I would store them as tuples, or 
return them as tuples by a function getpixel(x,y). Why should I not be 
allowed to check for getpixel(xy).count(0) == n for black pixels in an 
image with n layers? Yes, you could set BLACK=(0,)*n and test against 
BLACK. You can always do things differently.

>> Take for example, a chess game. You are storing the pieces in a 
>> 64-tuple, where every piece has an integer value corresponding to its 
>> value in the game (white positive, black negative). You can 
>> approximate the value of a position by building the sum(). You want to 
>> use the tuple as a key for a dictionary of stored board constellations 
>> (e.g. an opening dictionary), therefore you don't use a list.
>>
> This is a pretty bogus use case. Seems to me like a special case it's 
> not worth breaking the rules for!

I already explained why use cases for count() and index() on tuples will 
principally be rare. So it will always be easy for you to call them 
"special cases". But they are there, and they make sense.

Also, we are not talking about "breaking" a rule or any existing code 
here, but about generalizing or *broadening* a rule. (What do you do if 
a rule itself is broken?)

Here is another example illustrating the problem - coincidentally, from 
Fredrik's blog, http://effbot.org/zone/image-histogram-optimization.htm
(found it when googling for getpixel):

def histogram4(image):
 # wait, use the list.count operator
 data = list(image.getdata())
 result = []
 for i in range(256):
 result.append(data.count(i))
 return result

Here, we could imagine the getdata() method returning a tuple. Again, 
why must it be casted to a list, just to use count()? In reality, 
getdata() returns a sequence. But again, wouldn't it be nice if all 
sequences provided count() and index() methods? Then there would be no 
need to create a list from the sequence before counting.

-- Christoph
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: General question about Python design goals