Hi all, Apropos hash tables.
For some years now, we have generic lists in gnulib. "Generic" means that the programmer can switch the implementation easily, because he's programming against an abstract API that has a number of different implementations. Here is my draft for applying the same approach to unordered sets. The normal implementation would be a hash table, but there are important variations: - Is the key a pointer, or a memory block of arbitrary size? - Is the value stored, or implicit? - Does inserting a (key,value) pair make a copy of the key? - Does the table allow removals? Feedback is highly appreciated! Do you see some interesting use case that I have overlooked? There shall be three abstract types: * Unordered set * Mapping from an unordered set to a 'void *' value * Unordered multiset Unordered Set ============= There are two flavours, depending on the key type: either a pointer (no copying involved upon insertion), or a memory block of varying size (copied during insertion). 1) Pointers. Files: gl_set.[hc] The constructor takes an equals and hashcode method. Variants: a) array Files: gl_array_set.[hc] arrayhc with cache of hash code Files: gl_arrayhc_set.[hc] b) hash Files: gl_hash_set.[hc] Operation ARRAY HASH ARRAYHC gl_set_size O(1) O(1) gl_set_add O(n) O(1) gl_set_remove O(n) O(1) gl_set_search O(n) O(1) gl_set_iterator O(1) O(1) gl_set_iterator_next O(1) O(1) Here the search method returns a void *, with value (void*)-1 denoting "not found". Hmm, or should the search method better take a 'bool *' argument??? 2) Memory block of varying size Files: gl_blobset.[hc] It has the same API, with different parameters (address and size of blob). The constructor uses a vtable parameter that specifies the allocator: a) for a set that allows removals: malloc-based b) for a set with no removals (hoarding): obstack-based Mapping from an unordered set to a 'void *' value ================================================= Likewise, it depends on the key type. 1) Pointers. Files: gl_map.[hc] The constructor takes an equals and hashcode method and a free method for the values. Variants: as above. a) array Files: gl_array_map.[hc] arrayhc with cache of hash code Files: gl_arrayhc_map.[hc] b) hash Files: gl_hash_map.[hc] Here the search method returns a void**, pointer to the value cell, or NULL if not found. 2) Memory block of varying size Files: gl_blobmap.[hc] It has the same API, with different parameters (address and size of blob). The constructor uses a vtable parameter that specifies the allocator: a) for a set that allows removals: malloc-based b) for a set with no removals (hoarding): obstack-based Unordered multiset ================== They are implemented as a mapping from an unordered set to an 'uintptr_t' value. Bruno