>"Bill Baxter" <wbax...@gmail.com> wrote in message 
>news:mailman.34.1264542189.4461.digitalmars-d-le...@puremagic.com...
>On Tue, Jan 26, 2010 at 1:21 PM, bearophile <bearophileh...@lycos.com> 
>wrote:
>> Nick Sabalausky:
>>> Aside from that being how Python does it, why do you see that as 
>>> preferable?
>>
>> Because:
>> 1) linear searches in an array are damn common. I don't remember the 
>> results of my benchmarks, but until your integer arrays is quite longer 
>> than 30-50 items, performing a linear search is faster than a lookup in 
>> an AA, on DMD. On Tango this number is probably 70% higher
>> 1b) In Python if you perform a "foo" in "barfoo" the language doesn't 
>> perform a linear search, it uses a much smarter search that has a 
>> complexity lower than the product of the two lengths, using a custom 
>> algorithm. So in D you can use the same syntax to search for 
>> substrings/subarrays. Where such smarter search is not possible, D can 
>> use a naive search.
>> 2) It's really handy. I use isIn(item, items) to search on arrays in D, 
>> but having a item in items is nicer.
>> 3) You can use the same syntax to search into anything that's lazily 
>> iterable too (a Range). This is very handy.
>>
>>
>>> So having a single syntax work on the outputs for
>>> regular arrays, but then on the inputs for AAs, seems highly 
>>> inconsistent
>>> and error-prone to me.
>>
>> I have followed many Python newbies personally, I am following the Python 
>> newsgroups, and I have programmed for years in Python, and while I have 
>> seen many different kinds of bugs, I have not seen a significant amount 
>> of bugs in this. Python programmers just learn that dicts and lists are a 
>> little different in this regard. At the same way they learn that a set 
>> and a dict are different data structures, with different capabilities and 
>> usages.
>
>It's not even really  inconsistent if you just think about these data
>structures in terms of function rather than form.
>An array is often used as a simple set of things.  "O in Array" means
>"is O in that set of things"
>An AA is a set of things that also have some associated data.  "O in
>AA" means "is O in that set of things" (not the ancillary data)
>If you have an actual "set" data structure for containing a set of of
>things, then "O in Set" means, again, "is O in that set of things".
>(In fact the closest thing D has to a built-in set type is an AA with
>"don't care" associated data, reinforcing the notion of AA as a set
>plus extra data.)
>
>

Even looking at function rather than form, I still think its innacurate to 
consider the keys to be the elements of an AA. In most uses of an AA, the 
key is primarily something convenient with which to look up data. They hold 
significance, but typically not as much as the data that is looked up with 
it. What you've described is very much like (and quite literally the same 
as, in the case of many dynamic languages) thinking of a variable's name as 
the actual data, and thinking of the value it holds merely as "ancillary 
data".

Keep in mind too, even with a regular array, the index can still hold 
significance as data. For instace, I could have an array of Foo's, declare 
that any element with an odd index has property 'A' and any with an even 
index has property 'B', and treat them as such. May seem strange at a 
glance, but such things are common in low-level, low-resoruce and 
performance-oriented code. Bottom line, though, is that "Property 'A' or 
'B'" is data that now been encoded in the array's index, but despite that, 
the indicies still aren't considered the array's elements. And the data they 
lookup still isn't considered "ancillary data".

And yes, a Hashed Set may likely be *implemented* as an AA with just keys, 
but that's just form, it doesn't imply a similarity in regard to function. 
The *function* of a HashSet is that of an unordered array that's been 
optimized for "contains X / doesn't contain X" on large data sets.

Containers and their functions:
- AA: Store A's with label B, A is fairly important, B may or may not be.
- Array: Store A's with label B, A is fairly important, B may or may not be, 
B has certain restrictions, container overall has different performance 
characteristacs from an AA.
- Hashed Set: Store A's.


Reply via email to