Re: [Tutor] A couple of somewhat esoteric questions

2014-10-22 Thread Dave Angel
Dave Angel  Wrote
 in message:
> "Clayton Kirkwood"  Wrote in message:
>> 
>> 
>> !-Original Message-
>> !From: Tutor [mailto:tutor-bounces+crk=godblessthe...@python.org] On
>> !Behalf Of Steven D'Aprano
> ...
>> 
>> For clarification, a key only has one value which can be changed. 
> 
> No, because the key has to be immutable, like a string or an int.

My error for reading your statement wrong.

> 
>> Multiple
>> keys can have the same value (hopefully not to the same memory location,
>> because one would usually not want the change of one key's value to alter
>> another key's value. As to the hash table, yes, I agree that this would be
>> one to many, hence the chains.
> 
> Memory locations are an implementation detail. We're talking in
>  this paragraph about keys and values. A value is an object, the
>  dict provides a one-way mapping from immutable key to an
>  arbitrary value object. If the value objects happen to be
>  immutable,  it makes no difference if a program uses the same
>  value for multiple keys. If the values are not immutable,  python
>  makes no assurance whether two values are equal, or identical. If
>  the latter is undesirable for a particular use, it's up to the
>  application logic to prevent it. It's probably more often useful
>  than problematic. Python makes no such assurances.
>  
> 
> 
> 
> 
>> !Can you give a concrete example of what you're trying to do?
>> 
>> The changing of the order is necessary when you want numeric indexing, say
>> for columns or rows.
>> 
> 
> No idea how this relates to dicts, which have no order, nor rows
>  nor columns. Make it concrete.
> 
> 
> -- 
> DaveA
> 
> ___
> Tutor maillist  -  Tutor@python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
> 
> 


-- 
DaveA

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] A couple of somewhat esoteric questions

2014-10-22 Thread Dave Angel
"Clayton Kirkwood"  Wrote in message:
> 
> 
> !-Original Message-
> !From: Tutor [mailto:tutor-bounces+crk=godblessthe...@python.org] On
> !Behalf Of Steven D'Aprano
...
> 
> For clarification, a key only has one value which can be changed. 

No, because the key has to be immutable, like a string or an int.

> Multiple
> keys can have the same value (hopefully not to the same memory location,
> because one would usually not want the change of one key's value to alter
> another key's value. As to the hash table, yes, I agree that this would be
> one to many, hence the chains.

Memory locations are an implementation detail. We're talking in
 this paragraph about keys and values. A value is an object, the
 dict provides a one-way mapping from immutable key to an
 arbitrary value object. If the value objects happen to be
 immutable,  it makes no difference if a program uses the same
 value for multiple keys. If the values are not immutable,  python
 makes no assurance whether two values are equal, or identical. If
 the latter is undesirable for a particular use, it's up to the
 application logic to prevent it. It's probably more often useful
 than problematic. Python makes no such assurances.
 




> !Can you give a concrete example of what you're trying to do?
> 
> The changing of the order is necessary when you want numeric indexing, say
> for columns or rows.
> 

No idea how this relates to dicts, which have no order, nor rows
 nor columns. Make it concrete.


-- 
DaveA

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] A couple of somewhat esoteric questions

2014-10-22 Thread Clayton Kirkwood


!-Original Message-
!From: Tutor [mailto:tutor-bounces+crk=godblessthe...@python.org] On
!Behalf Of Steven D'Aprano
!Sent: Wednesday, October 22, 2014 4:40 AM
!To: tutor@python.org
!Subject: Re: [Tutor] A couple of somewhat esoteric questions
!
!On Tue, Oct 21, 2014 at 04:54:33PM -0700, Clayton Kirkwood wrote:
!
!> As I've contemplated the usage of dictionaries, I face the question of
!> efficiency. Going back before most of you were probably born;<)) if I
!> remember correctly dictionaries(assoc. arrays), having hashes, are
!> efficient for storing sparse arrays with the added benefit of hiding
!> the traversal of the dictionary looking for the proper key, and being
!> much less efficient if looking for the value and trying to retrieve
!> its key. But it also depends on the hash key and algorithm and how
!many spaces are "available" for filling.
!> If a hash is created for a small array, which is known ahead of time
!> with some possible hints, the array is efficient but has the distinct
!> difficulty of use when more and more pairs are created, because either
!> the new pair matches an already used hash point and a trail must be
!> descended off of that particular hash point or a mechanism must be
!> used to determine how the new pair should be stored elsewhere in the
!> array like at the end and matching becomes time consuming.
!
!This is standard for hash tables. Python, if I recall correctly, uses
!linear addressing rather than chaining.
!
!Python's hash function can return approximately 4294967296 different
!values (2**32), which means that for random or arbitrary keys, you won't
!expect more than a tiny handful of collisions. Of course a hostile
!adversary might, sometimes, be able to feed you keys that all hash the
!same, but in practice that's very unlikely to happen. And even if it
!does happen, the worst that will occur is that dict look-ups will
!degenerate to being proportional to the number of items, which isn't too
!bad.


I would hope that the full array is not created?


!
!> And/or, a new hash key and perhaps a different algorithm must be used
!> for creating the larger array.
!
!Python always uses the same hash algorithm. It doesn't try to change
!algorithms as the table gets bigger, it just scales the hash value to
!the size of the table. As the table gets full, the table is doubled in
!size to make room.

Out of curiousity, does a new array get built in a larger home and
everything redistributed, or does more memory get added on? I think the
second would be difficult because a new seed and size value would be needed
to properly distribute.

!
!(To be pedantic, I'm talking here about the standard CPython reference
!implementation. There are other implementations, such as Jython and
!IronPython and PyPy, which may do other things.)
!
!
!> Also, since the array
!> allows for multiple keys pointing to possibly identical values, the
!> array is not necessarily 1 to 1 (could be 1 to multiple although we
!> use replacement of the value normally) and definitely cannot be onto:
!> there is no rule that says the mapping from value back to key is
!singular and unique.
!
!Yes, but I don't see your point. Hash tables are designed to be one-to-
!many.

For clarification, a key only has one value which can be changed. Multiple
keys can have the same value (hopefully not to the same memory location,
because one would usually not want the change of one key's value to alter
another key's value. As to the hash table, yes, I agree that this would be
one to many, hence the chains.

!
!If you want to understand the *precise* details of how Python dicts
!work, you will need to read the source code written in C. But as an over
!simplification:
!
!- when you create an empty dict, it is created with a certain number
!  of free slots in a table;
!- as you add items to the dict, the dict knows how many free slots
!  remain;
!- when the table is half full, the dict resizes; very small dicts
!  are increased to four times bigger, while larger dicts are
!  doubled in size;
!- that means that *on average* adding new items to a dict takes
!  the same amount of time (amortized over the lifetime of the
!  dict).
!
!
!> I've not had much use of dictionaries in the past so maybe I am
!> missing something. As I contemplated dictionary use, it seems changing
!> the order of entries is best left to double single arrays - which can
!> be modified together when changing order. Dicts don't seem to support
!> moving pairs around and in fact would be broken if changes were
!attempted.
!
!I don't understand what you are trying to say here. Hash tables are
!unordered, and "changing the order of entries" makes no sense for them.
!Can you give a concrete example of what you're trying to do?

The changing of the order is necessary when you want numeri

Re: [Tutor] A couple of somewhat esoteric questions

2014-10-22 Thread Steven D'Aprano
On Tue, Oct 21, 2014 at 04:54:33PM -0700, Clayton Kirkwood wrote:

> As I've contemplated the usage of dictionaries, I face the question of
> efficiency. Going back before most of you were probably born;<)) if I
> remember correctly dictionaries(assoc. arrays), having hashes, are efficient
> for storing sparse arrays with the added benefit of hiding the traversal of
> the dictionary looking for the proper key, and being much less efficient if
> looking for the value and trying to retrieve its key. But it also depends on
> the hash key and algorithm and how many spaces are "available" for filling.
> If a hash is created for a small array, which is known ahead of time with
> some possible hints, the array is efficient but has the distinct difficulty
> of use when more and more pairs are created, because either the new pair
> matches an already used hash point and a trail must be descended off of that
> particular hash point or a mechanism must be used to determine how the new
> pair should be stored elsewhere in the array like at the end and matching
> becomes time consuming. 

This is standard for hash tables. Python, if I recall correctly, uses 
linear addressing rather than chaining.

Python's hash function can return approximately 4294967296 different 
values (2**32), which means that for random or arbitrary keys, you won't 
expect more than a tiny handful of collisions. Of course a hostile 
adversary might, sometimes, be able to feed you keys that all hash the 
same, but in practice that's very unlikely to happen. And even if it 
does happen, the worst that will occur is that dict look-ups will 
degenerate to being proportional to the number of items, which isn't too 
bad.

> And/or, a new hash key and perhaps a different
> algorithm must be used for creating the larger array. 

Python always uses the same hash algorithm. It doesn't try to change 
algorithms as the table gets bigger, it just scales the hash value to 
the size of the table. As the table gets full, the table is doubled in 
size to make room.

(To be pedantic, I'm talking here about the standard CPython reference 
implementation. There are other implementations, such as Jython and 
IronPython and PyPy, which may do other things.)


> Also, since the array
> allows for multiple keys pointing to possibly identical values, the array is
> not necessarily 1 to 1 (could be 1 to multiple although we use replacement
> of the value normally) and definitely cannot be onto: there is no rule that
> says the mapping from value back to key is singular and unique.

Yes, but I don't see your point. Hash tables are designed to be 
one-to-many.

If you want to understand the *precise* details of how Python dicts 
work, you will need to read the source code written in C. But as an over 
simplification:

- when you create an empty dict, it is created with a certain number 
  of free slots in a table;
- as you add items to the dict, the dict knows how many free slots
  remain;
- when the table is half full, the dict resizes; very small dicts
  are increased to four times bigger, while larger dicts are 
  doubled in size;
- that means that *on average* adding new items to a dict takes
  the same amount of time (amortized over the lifetime of the
  dict).


> I've not had much use of dictionaries in the past so maybe I am missing
> something. As I contemplated dictionary use, it seems changing the order of
> entries is best left to double single arrays - which can be modified
> together when changing order. Dicts don't seem to support moving pairs
> around and in fact would be broken if changes were attempted.

I don't understand what you are trying to say here. Hash tables are 
unordered, and "changing the order of entries" makes no sense for them. 
Can you give a concrete example of what you're trying to do?


> Kind of like
> using a real dictionary, you can have a word (key) have multiple meanings
> (values),

In Python, the way to do that is by associating a list of meanings with 
the one key:

d = {'key': ['first', 'second', 'third'],
 'another key': ['alpha', 'beta']}


> and you can have multiple words (keys) having the same meaning
> (value). Again, with the difference from a real dictionary, our dicts can't
> have keys pointing to different values. 

They certainly can.


> But there are definitely uses for dicts.

One or two, one or two. Hundred. Thousand.

Seriously, dicts are probably the most commonly used data structure in 
Python, by far. They are used for modules, classes, instances, global 
variables are stored inside a dict, built-in functions are stored inside 
a dict, modules are cached inside a dict. They are efficient and fast 
and highly tuned, and there is almost never any need to try to re-invent 
the wheel.


-- 
Steven
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] A couple of somewhat esoteric questions

2014-10-22 Thread Alan Gauld

On 22/10/14 00:54, Clayton Kirkwood wrote:

As I’ve contemplated the usage of dictionaries, I face the question of
efficiency.


Don;t worry about it. Python dictionaries are highly efficient and used 
throughout the language. namespaces and classes are both effectively 
dictionaries so every time you use a variable name or class attribute 
(whether your own or a built-in) you are using a dictionary.



remember correctly dictionaries(assoc. arrays), having hashes, are
efficient for storing sparse arrays


They are sparce arrays using hashing for lookup.


traversal of the dictionary looking for the proper key, and being much
less efficient if looking for the value and trying to retrieve its key.


Indeed, that's why dictionaries don't have a get key for
value operation.


But it also depends on the hash key and algorithm and how many spaces
are “available” for filling.


Yes, and Python's hashing has been tuned and optimised over many years.


I’ve not had much use of dictionaries in the past so maybe I am missing
something.


You are using them all the time. Python is built on dictionaries.
They are very powerful and efficient.


As I contemplated dictionary use, it seems changing the order
of entries is best left to double single arrays


There is no order. The order may change in use and you have
no control over that. It's a Python internal detail and you as a 
programmer cannot mess with it.



--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.flickr.com/photos/alangauldphotos

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


[Tutor] A couple of somewhat esoteric questions

2014-10-22 Thread Clayton Kirkwood
As I've contemplated the usage of dictionaries, I face the question of
efficiency. Going back before most of you were probably born;<)) if I
remember correctly dictionaries(assoc. arrays), having hashes, are efficient
for storing sparse arrays with the added benefit of hiding the traversal of
the dictionary looking for the proper key, and being much less efficient if
looking for the value and trying to retrieve its key. But it also depends on
the hash key and algorithm and how many spaces are "available" for filling.
If a hash is created for a small array, which is known ahead of time with
some possible hints, the array is efficient but has the distinct difficulty
of use when more and more pairs are created, because either the new pair
matches an already used hash point and a trail must be descended off of that
particular hash point or a mechanism must be used to determine how the new
pair should be stored elsewhere in the array like at the end and matching
becomes time consuming. And/or, a new hash key and perhaps a different
algorithm must be used for creating the larger array. Also, since the array
allows for multiple keys pointing to possibly identical values, the array is
not necessarily 1 to 1 (could be 1 to multiple although we use replacement
of the value normally) and definitely cannot be onto: there is no rule that
says the mapping from value back to key is singular and unique.

 

I've not had much use of dictionaries in the past so maybe I am missing
something. As I contemplated dictionary use, it seems changing the order of
entries is best left to double single arrays - which can be modified
together when changing order. Dicts don't seem to support moving pairs
around and in fact would be broken if changes were attempted. Kind of like
using a real dictionary, you can have a word (key) have multiple meanings
(values), and you can have multiple words (keys) having the same meaning
(value). Again, with the difference from a real dictionary, our dicts can't
have keys pointing to different values. But there are definitely uses for
dicts.

 

Seem right???

 

Clayton

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor