Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-05 Thread Peng Yu
> Documentation that takes ten pages to say something is just as bad as
> documentation that leaves stuff out, because it's almost guaranteed
> that it won't be read.

That's the point. If a simple example (6 lines) can demonstrate the
concept, why spending "ten pages" to explain it. My experience is that
for certain things, it is better describe by a spec once you know it,
but it is certainly not true for people to learn it. A reasonable
strategy is to interleave spec with demonstrating examples. There is
no excuse to not to make the manual easier to read.

-- 
Regards,
Peng
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-05 Thread Emile van Sebille

On 5/5/2012 4:04 AM Peng Yu said...


I agree that people have different opinions on issues like this. But I
think that "The Customer Is God". Readers of the doc is the customers,
the writers of the doc is the producers. The opinion of customers
should carry more weight than producers.


Only to a point.  The experience of the producers should carry more 
weight than the opinion of the customers.


Emile




--
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-05 Thread Chris Angelico
On Sat, May 5, 2012 at 9:04 PM, Peng Yu  wrote:
> I agree that people have different opinions on issues like this. But I
> think that "The Customer Is God". Readers of the doc is the customers,
> the writers of the doc is the producers. The opinion of customers
> should carry more weight than producers.

Nah, the customer's not God, and I have proof. http://notalwaysright.com/

Oops, now everyone's off reading funny stories about stupid/abusive
customers. Well, when you're all back...

The reason you're reading documentation is to learn. You're not
handing over wads of cash and saying "Do stuff for me"; you're reading
the Player's Handbook and learning which dice to roll when. Perhaps
there's some jargon that you don't understand; in that case, either a
FAQ/glossary or a forum post will set you straight. But if the doc
says that something can't be relied upon, then there's nothing more to
write there.

Documentation that takes ten pages to say something is just as bad as
documentation that leaves stuff out, because it's almost guaranteed
that it won't be read.

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-05 Thread Peng Yu
Hi Terry,

Thank you for you detailed email.

> If two collections are equal, should the iteration order be the same? It has
> always been true that if hash values collide, insertion order matters.
> However, a good hash function avoids hash collisions as much as possible in
> practical use cases. Without doing something artificial, as I did with the
> example, collisions should be especially rare on 64-bit builds. If one
> collection has a series of additions and deletions so that the underlying
> hash table has a different size than an equal collection build just from
> insertions, then order will also be different.

The reason that I asked to add the artificial example in the doc is
because I never completely understand the unorderness until I see your
artificial example. You will surely use more words for explaining what
"unorderness" means than just showing your example. And the example
(since formatted as code) is more eye catching than just plain text.

For my case, since I didn't understand the unorderness, I made some
subtle bug in my program, which works fine in my testing code.
However, it produce a small amount of corrupted results for the real
production use, which is harder to debug. It did wasted quite some of
my time.

> For the doc, the problem is what to say and where without being repetitous
> (and to get multiple people to agree ;-).

I agree that people have different opinions on issues like this. But I
think that "The Customer Is God". Readers of the doc is the customers,
the writers of the doc is the producers. The opinion of customers
should carry more weight than producers.

-- 
Regards,
Peng
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-04 Thread Terry Reedy

Peng, I actually am thinking about it.

Underlying problem: while unordered means conceptually unordered as far 
as the collection goes, the items in the collection, if homogenous 
enough, may have a natural order, which users find hard to ignore. Even 
if not comparable, an implementation such as CPython that uses linear 
sequential memory will impose some order. Even if the implementation 
uses unordered (holographic?) memory, order will be imposed to iterate, 
as when creating a serialized representation of the collection. Abstract 
objects, concrete objects, and serialized representations are three 
different things, but people tend to conflate them.


Order consistency issues: if the unordered collection is iterated, when 
can one expect the order to be the same? Theoretically, essentially 
never, except that iterating dicts by keys, values, or key-value pairs 
is guaranteed to be consistent, which means that re-iterating has to be 
consistent. I actually think the same might as well be true for sets, 
although there is no doc that says so.


If two collections are equal, should the iteration order be the same? It 
has always been true that if hash values collide, insertion order 
matters. However, a good hash function avoids hash collisions as much as 
possible in practical use cases. Without doing something artificial, as 
I did with the example, collisions should be especially rare on 64-bit 
builds. If one collection has a series of additions and deletions so 
that the underlying hash table has a different size than an equal 
collection build just from insertions, then order will also be different.


If the same collection is built by insertion in the same order, but in 
different runs, bugfix versions, or language versions, will iteration 
order by the same? Historically, it has been for CPython for about a 
decade, and people has come to depend on that constancy, in spite of 
warning not to. Even core developers have not been immune, as the 
CPython test suite has a few set or dict iteration order dependencies 
until it was recently randomized.


Late last year, it became obvious that this constancy is a practical 
denial-of-service security hole. The option to randomize hashing for 
each run was added to 2.6, 2.7, 3.1, and 3.2. Randomized hashing by run 
is part of 3.3. So some of the discussion above is obsolete. The example 
I gave only works for that one run, as hash('a') changes each run. So 
iteration order now changes with each run in fact as well as in theory.


For the doc, the problem is what to say and where without being 
repetitous (and to get multiple people to agree ;-).


--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-04 Thread Mark Lawrence

On 05/05/2012 00:37, Peng Yu wrote:


My point is if something is said in the document, it is better to be
substantiated by an example. I don't think that this has anything with
"learn the spec from behaviour."



I side with the comments made by Terry Reedy and Cameron Simpson so 
please give it a rest, you're flogging a dead horse.


--
Cheers.

Mark Lawrence.

--
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-04 Thread Peng Yu
On Fri, May 4, 2012 at 6:12 PM, Cameron Simpson  wrote:
> On 04May2012 15:08, Peng Yu  wrote:
> | On Fri, May 4, 2012 at 12:43 PM, Terry Reedy  wrote:
> | > On 5/4/2012 8:00 AM, Peng Yu wrote:
> | >> On Fri, May 4, 2012 at 6:21 AM, Chris Angelico  wrote:
> | >>> On Fri, May 4, 2012 at 8:14 PM, Peng Yu  wrote:
> |  Thanks. This is what I'm looking for. I think that this should be
> |  added to the python document as a manifestation (but nonnormalized) of
> |  what "A set object is an unordered collection of distinct hashable
> |  objects" means.
> | >>>
> | >>> There are other things that can prove it to be unordered, too; the
> | >>> exact pattern and order of additions and deletions can affect the
> | >>> iteration order. The only thing you can be sure of is that you can't
> | >>> be sure of it.
> | >>
> | >> I agree. My point was just to suggest adding more explanations on the
> | >> details in the manual.
> | >
> | > I am not sure how much clearer we can be in the language manual. The word
> | > 'unordered' means just that. [...]
> |
> | You can just add the example that you posted to demonstrate what the
> | unordered means. A curious user might want to know under what
> | condition the "unorderness" can affect the results, because for
> | trivial examples (like the following), it does seem that there is some
> | orderness in a set.
>
> I'm with Terry here: anything else in the line you suggest would
> complicate things for the reader, and potentially mislead.
>
> Future implementation changes (and, indeed, _other_ implementations like
> Jython) can change any of this. So there _are_ no ``condition the
> "unorderness" can affect the results'': a set is unordered, and you
> could even _legitimately_ get different orders from the same set
> if you iterate over it twice. It is unlikely, but permissable.
>
> Any attempt to describe such conditions beyond "it might happen at any
> time" would be misleading.
>
> | set(['a', 'b', 'c'])
> | set(['c', 'b', 'a'])
>
> The language does not say these will get the same iteration order. It
> happens that the Python you're using, today, does that.
>
> You can't learn the language specification from watching behaviour;
> you learn the guarrenteed behaviour -- what you may rely on happening --
> from the specification, and you can test that an implementation obeys (or
> at any rate, does not disobey) the specification by watching behaviour.
>
> You seem to be trying to learn the spec from behaviour.

My point is if something is said in the document, it is better to be
substantiated by an example. I don't think that this has anything with
"learn the spec from behaviour."

-- 
Regards,
Peng
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-04 Thread Cameron Simpson
On 04May2012 15:08, Peng Yu  wrote:
| On Fri, May 4, 2012 at 12:43 PM, Terry Reedy  wrote:
| > On 5/4/2012 8:00 AM, Peng Yu wrote:
| >> On Fri, May 4, 2012 at 6:21 AM, Chris Angelico  wrote:
| >>> On Fri, May 4, 2012 at 8:14 PM, Peng Yu  wrote:
|  Thanks. This is what I'm looking for. I think that this should be
|  added to the python document as a manifestation (but nonnormalized) of
|  what "A set object is an unordered collection of distinct hashable
|  objects" means.
| >>>
| >>> There are other things that can prove it to be unordered, too; the
| >>> exact pattern and order of additions and deletions can affect the
| >>> iteration order. The only thing you can be sure of is that you can't
| >>> be sure of it.
| >>
| >> I agree. My point was just to suggest adding more explanations on the
| >> details in the manual.
| >
| > I am not sure how much clearer we can be in the language manual. The word
| > 'unordered' means just that. [...]
| 
| You can just add the example that you posted to demonstrate what the
| unordered means. A curious user might want to know under what
| condition the "unorderness" can affect the results, because for
| trivial examples (like the following), it does seem that there is some
| orderness in a set.

I'm with Terry here: anything else in the line you suggest would
complicate things for the reader, and potentially mislead.

Future implementation changes (and, indeed, _other_ implementations like
Jython) can change any of this. So there _are_ no ``condition the
"unorderness" can affect the results'': a set is unordered, and you
could even _legitimately_ get different orders from the same set
if you iterate over it twice. It is unlikely, but permissable.

Any attempt to describe such conditions beyond "it might happen at any
time" would be misleading.

| set(['a', 'b', 'c'])
| set(['c', 'b', 'a'])

The language does not say these will get the same iteration order. It
happens that the Python you're using, today, does that.

You can't learn the language specification from watching behaviour;
you learn the guarrenteed behaviour -- what you may rely on happening --
from the specification, and you can test that an implementation obeys (or
at any rate, does not disobey) the specification by watching behaviour.

You seem to be trying to learn the spec from behaviour.

Cheers,
-- 
Cameron Simpson  DoD#743
http://www.cskk.ezoshosting.com/cs/

Loud pipes make noise. Skill and experience save lives.
- EdBob Morandi
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-04 Thread Peng Yu
On Fri, May 4, 2012 at 12:43 PM, Terry Reedy  wrote:
> On 5/4/2012 8:00 AM, Peng Yu wrote:
>>
>> On Fri, May 4, 2012 at 6:21 AM, Chris Angelico  wrote:
>>>
>>> On Fri, May 4, 2012 at 8:14 PM, Peng Yu  wrote:

 Thanks. This is what I'm looking for. I think that this should be
 added to the python document as a manifestation (but nonnormalized) of
 what "A set object is an unordered collection of distinct hashable
 objects" means.
>>>
>>>
>>> There are other things that can prove it to be unordered, too; the
>>> exact pattern and order of additions and deletions can affect the
>>> iteration order. The only thing you can be sure of is that you can't
>>> be sure of it.
>>
>>
>> I agree. My point was just to suggest adding more explanations on the
>> details in the manual.
>
>
> I am not sure how much clearer we can be in the language manual. The word
> 'unordered' means just that. If one imposes an arbitrary linear order on an
> unordered collection, it is arbitrary. It is frustrating that people do not
> want to believe that, and even write tests depending on today's arbitrary
> serialization order being deterministic indefinitely. There is a section
> about this in the doctest doc, but people do it anyway. I will think about a
> sentence to add.

You can just add the example that you posted to demonstrate what the
unordered means. A curious user might want to know under what
condition the "unorderness" can affect the results, because for
trivial examples (like the following), it does seem that there is some
orderness in a set.

set(['a', 'b', 'c'])
set(['c', 'b', 'a'])

-- 
Regards,
Peng
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-04 Thread Terry Reedy

On 5/4/2012 8:00 AM, Peng Yu wrote:

On Fri, May 4, 2012 at 6:21 AM, Chris Angelico  wrote:

On Fri, May 4, 2012 at 8:14 PM, Peng Yu  wrote:

Thanks. This is what I'm looking for. I think that this should be
added to the python document as a manifestation (but nonnormalized) of
what "A set object is an unordered collection of distinct hashable
objects" means.


There are other things that can prove it to be unordered, too; the
exact pattern and order of additions and deletions can affect the
iteration order. The only thing you can be sure of is that you can't
be sure of it.


I agree. My point was just to suggest adding more explanations on the
details in the manual.


I am not sure how much clearer we can be in the language manual. The 
word 'unordered' means just that. If one imposes an arbitrary linear 
order on an unordered collection, it is arbitrary. It is frustrating 
that people do not want to believe that, and even write tests depending 
on today's arbitrary serialization order being deterministic 
indefinitely. There is a section about this in the doctest doc, but 
people do it anyway. I will think about a sentence to add.


--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-04 Thread Peng Yu
On Fri, May 4, 2012 at 6:21 AM, Chris Angelico  wrote:
> On Fri, May 4, 2012 at 8:14 PM, Peng Yu  wrote:
>> Thanks. This is what I'm looking for. I think that this should be
>> added to the python document as a manifestation (but nonnormalized) of
>> what "A set object is an unordered collection of distinct hashable
>> objects" means.
>
> There are other things that can prove it to be unordered, too; the
> exact pattern and order of additions and deletions can affect the
> iteration order. The only thing you can be sure of is that you can't
> be sure of it.

I agree. My point was just to suggest adding more explanations on the
details in the manual.

-- 
Regards,
Peng
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-04 Thread Chris Angelico
On Fri, May 4, 2012 at 8:14 PM, Peng Yu  wrote:
> Thanks. This is what I'm looking for. I think that this should be
> added to the python document as a manifestation (but nonnormalized) of
> what "A set object is an unordered collection of distinct hashable
> objects" means.

There are other things that can prove it to be unordered, too; the
exact pattern and order of additions and deletions can affect the
iteration order. The only thing you can be sure of is that you can't
be sure of it.

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-04 Thread Peng Yu
On Thu, May 3, 2012 at 11:16 PM, Terry Reedy  wrote:
> On 5/3/2012 8:36 PM, Peng Yu wrote:
>>
>> Hi,
>>
>> list(a_set)
>>
>> When convert two sets with the same elements to two lists, are the
>> lists always going to be the same (i.e., the elements in each list are
>> ordered the same)? Is it documented anywhere?
>
>
> "A set object is an unordered collection of distinct hashable objects".
> If you create a set from unequal objects with equal hashes, the iteration
> order may (should, will) depend on the insertion order as the first object
> added with a colliding hash will be at its 'natural position in the hash
> table while succeeding objects will be elsewhere.
>
> Python 3.3.0a3 (default, May  1 2012, 16:46:00)
 hash('a')
> -292766495615408879
 hash(-292766495615408879)
> -292766495615408879
 a = {'a', -292766495615408879}
 b = {-292766495615408879, 'a'}
 list(a)
> [-292766495615408879, 'a']
 list(b)
> ['a', -292766495615408879]

Thanks. This is what I'm looking for. I think that this should be
added to the python document as a manifestation (but nonnormalized) of
what "A set object is an unordered collection of distinct hashable
objects" means.

-- 
Regards,
Peng
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-03 Thread Terry Reedy

On 5/3/2012 8:36 PM, Peng Yu wrote:

Hi,

list(a_set)

When convert two sets with the same elements to two lists, are the
lists always going to be the same (i.e., the elements in each list are
ordered the same)? Is it documented anywhere?


"A set object is an unordered collection of distinct hashable objects".
If you create a set from unequal objects with equal hashes, the 
iteration order may (should, will) depend on the insertion order as the 
first object added with a colliding hash will be at its 'natural 
position in the hash table while succeeding objects will be elsewhere.


Python 3.3.0a3 (default, May  1 2012, 16:46:00)
>>> hash('a')
-292766495615408879
>>> hash(-292766495615408879)
-292766495615408879
>>> a = {'a', -292766495615408879}
>>> b = {-292766495615408879, 'a'}
>>> list(a)
[-292766495615408879, 'a']
>>> list(b)
['a', -292766495615408879]

--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-03 Thread Andrew Berg
On 5/3/2012 7:36 PM, Peng Yu wrote:
> When convert two sets with the same elements to two lists, are the
> lists always going to be the same (i.e., the elements in each list are
> ordered the same)? Is it documented anywhere?
Sets are by definition unordered, so depending on their order would not
be a good idea. If the order stays the same, it's at most an
implementation detail which may or may not be consistent across
versions, and will likely not be consistent across implementations.

-- 
CPython 3.3.0a3 | Windows NT 6.1.7601.17790
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-03 Thread Tim Chase
On 05/03/12 19:36, Peng Yu wrote:
> list(a_set)
> 
> When convert two sets with the same elements to two lists, are the
> lists always going to be the same (i.e., the elements in each list are
> ordered the same)? Is it documented anywhere?

Sets are defined as unordered which the documentation[1] confirms.
A simple test seems to show that on cPython2.6 on this box, sets
with 100 elements converted to lists without sorting seem to
compare as equal (i.e., cPython is sorting), but I don't see any
guarantee that this should hold for other implementations, so I'd
sort first to ensure the intended behavior.

-tkc

[1]
http://docs.python.org/library/stdtypes.html#set-types-set-frozenset
"""
Being an unordered collection, sets do not record element position
or order of insertion.
"""




-- 
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-03 Thread Dan Stromberg
If you need the same ordering in two lists, you really should sort the
lists - though your comparison function need not be that traditional.  You
might be able to get away with not sorting sometimes, but on CPython
upgrades or using different Python interpreters (Pypy, Jython), it's almost
certain the ordering will be allowed to change.

But sorting in a loop is not generally a good thing - there's almost always
a better alternative.

On Thu, May 3, 2012 at 5:36 PM, Peng Yu  wrote:

> Hi,
>
> list(a_set)
>
> When convert two sets with the same elements to two lists, are the
> lists always going to be the same (i.e., the elements in each list are
> ordered the same)? Is it documented anywhere?
>
> --
> Regards,
> Peng
> --
> http://mail.python.org/mailman/listinfo/python-list
>
-- 
http://mail.python.org/mailman/listinfo/python-list