On Fri, 2006-12-01 at 11:58 -0800, Chris Hengge wrote:
> Ok, well... I think people lost the scope of my question.. I'm happy
> using the first method that was given to my post, I have stated in two
> emails that the order doesn't matter.. 
> 
> What I asked was why the order was changed, or more directly, what is
> the command actually doing to my data? I'm sure the order isn't
> totally random, but based on how the items are checked and dropped. 
> 
> The reason I care is because I'm just nosey like that and what to know
> what it is doing differently then the method I mentioned in the start
> of this thread. 
Your original method stepped through list1 and tested each element for
existence in list2.  Since you stated that the proportion of duplicates
is low, most searches of list2 are unsuccessful and require checking
every element of list2 from beginning to end.  For long lists the search
time adds up.

sets and dictionaries are hash based.  The location in the collection is
based on a hash value.  A check for membership computes the hash which
is used to pinpoint the location in the collection.  There are issues
where different elements have identical hashes, but those are handled
efficiently.  So searching for matches is dramatically faster than
stepping through a long list.

I believe the ordering within sets and dictionaries happens to match the
hash ordering.  
        for x in aset:
                print hash(x)
comes out in ascending order on the 3 sets I checked.  That's not really
a surprise.

In terms of correct programs, the order of set and dictionary items
should be viewed as arbitrary.

> 
> Never did I question the validity of the answer the first reply gave
> me, it works for what I need, not only that, it works well for what I
> need. I never put any stipulation on the order of the final data, so I
> didn't expect an answer that was order related. 
> 
> On 12/1/06, Tor Hildrum <[EMAIL PROTECTED]> wrote:
>         On 11/30/06, John Fouhy <[EMAIL PROTECTED]> wrote:
>         
>         > For the same reason that dictionaries don't preserve order.
>         > Basically, sets are (I think) implemented using a hash
>         table.  You can 
>         > read about hash tables on wikipedia (or many other places),
>         but one of
>         > the components of a hash table is a function mapping keys to
>         integers
>         > in a particular range.
>         
>         Why not just call a sigar for a sigar. 
>         
>         A set is a set, it may be implemented using a hash or it may
>         be
>         implemed using some other datastructure. It could be
>         implemented using
>         lists which preserves order, all though that doesn't make much
>         sense.
>         How it is implemented does not really matter here.
>         
>         http://en.wikipedia.org/wiki/Set
>         
>         If you want a collection of ordered objects, you don't want a
>         set. Not
>         even if the current implementation of sets in Python did
>         preserve 
>         order. Doing so could not be considered as anything else than
>         a ugly
>         hack or exploitation of the current implementation. And would
>         be
>         likely to break in the future.
>         
>         Tor
>         _______________________________________________ 
>         Tutor maillist  -  Tutor@python.org
>         http://mail.python.org/mailman/listinfo/tutor
> 
> _______________________________________________
> Tutor maillist  -  Tutor@python.org
> http://mail.python.org/mailman/listinfo/tutor
-- 
Lloyd Kvam
Venix Corp

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor

Reply via email to