On 19.11.14 12:21, porphyry5 wrote: > So if I read this hash table stuff correctly any data item that one > wants to look up in an associative array generates its own address in > memory, by using (a constant number of bytes of?) its value as a > single binary number and modifying that with some arithmetic algorithm > to produce an address in the available range. That is a really sweet > concept, but it must be rather wasteful of memory, lots of unused > spaces within its range.
In awk, at least (and probably most later adopters, such as perl), the memory is malloc-ed, i.e. builds as needed, and array entries can be of any size within available memory. The fact that an array need not be dimensioned, that array entries are created when referenced, and can be deleted, means that the implementation is not actually a simple array. Hash buckets, in other areas I've encountered, have usually been implemented as linked lists, but that is not cast in stone. If there is only one element at the hashed address, as is needed in encryption hashes, then there's no list there to walk. I have not examined the code to check how the hash table is implemented. > I use ultra-cheap, under-powered laptops (currently acer c720, new > laptop for < $200) which has only 2gb memory, but for 200,000 odd > words that's 10K available per word, lots of available waste space. Associative arrays will certainly conserve memory. The cost of this is that hashing a word takes one iteration of a small loop per character, before using that to address the hash table. (of pointers to the array elements) They are still more efficient than most other methods. For examples on the use of them, googling for the "Effective AWK Programming" manual should allow you to download a copy. The book The AWK Programming Language" by the eponymous authors of the language is published by Addison Wesley. Its concise presentation of illuminating examples, and permuted index, facilitate rapid extraction of the knowledge desired, without faffing around in overly elaborate verbosity. Any good reference will warn that "if (word in array) ... " syntax is needed for testing membership, since "if (array[word] != 0) ..." will create the element before testing it. (Please don't ask me how I know that.) > I really appreciate you hammering home this point about associative > arrays, I always assumed, their being so convenient to use, that they > had to be a very performance-costly luxury. Who knew. You are welcome. The language came out in 1977, and I've found it very useful and quick to code, in three decades in IT. Its C-like syntax makes it less of a write-only language than some others, I find. It is though, interpreted, not compiled, so manually looping over long lists is slow. Hashing should sort that out. > But I will forthwith rewrite the awk program with associative arrays > and compare the timing for the two methods. Thank you very much for > the trouble you have gone to on this. You are most welcome. All I did was point out the fruit hanging on the tree. Seeing someone new productively making jam with it is thanks enough. Erik -- Unix isn't hard, it's just a lot. (Ascribed to one of its originators) -- -- You received this message from the "vim_use" maillist. Do not top-post! Type your reply below the text you are replying to. For more information, visit http://www.vim.org/maillist.php --- You received this message because you are subscribed to the Google Groups "vim_use" group. To unsubscribe from this group and stop receiving emails from it, send an email to vim_use+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.