Is there a clever (i.e., cheap) approach for searching a group of sets for a "test" set? I suppose another way of asking this question is whether there is an Icon-like way of hashing sets (by using hashing that is built into the implementation, avoiding brute-force explicit computation of a hash in Icon code)?
This is the essence a subproblem of a problem that I am working on (the problem happens to be inventorying versions of files on many computers, many of them similar). Here is the long description of this subproblem: I want to maintain a group of properties related to instances of an entity. Many instances share exactly the same group of property values. There may be many instances of the entity, and the number of attributes is not known a priori. It may be (and frequently is) that there is a different number of properties applicable to different instances. Because there are so many instances, it is too expensive to maintain a list of all property values for each instance. Each property value consists of a property identifier and three values, which are grouped in an Icon record. (Unicon objects may be an acceptable alternative.) I collect the properties for a given instance of the entity in a non-deterministic order. What I would like to do is to save memory by associating a reference to a structure, so that each of the many instances with the same group of property values will only need to store a reference to the structure representing that group. To simplify the problem, I am assuming fewer than 256 attributes; however, I am assuming more than 256 groups. My thought was to "normalize" my data using an Icon set structure for each group of property value records. (I believe that a table structure could work just as well.) Each entity record would have a field referencing such a set. Each time that I learn a new property value for the instance represented by the entity record, I would go to a master list of sets, look for or construct a set consisting of the union of the newly acquired property value and the entity record's current set, and assign a reference to this other set to the entity record's properties field. "Look for" means constructing a set that is a union of the record's current set and the new property value and then testing each set in the master list for equivalence to the constructed set. (I am thinking of using the seteq procedure from sets.icn in the IPL to test for set equivalence.) The problem that I see with such an approach is that, if N were the maximum size of each set and M were the number of sets, then this search would be O(M*N). So, this approach doesn't scale well. Is there a clever alternative for searching a group of sets for a "test" set? It is not a requirement that the structure actually be an Icon set. One issue that confounds my problem is that since the properties may not become known in a particular order, I cannot see how to use a finite state machine (e.g., construct a tree that I can walk until I find the node that I am looking for or find that node to be missing and add it). Any ideas? ------------------------------------------------------------------------- This SF.Net email is sponsored by the Moblin Your Move Developer's challenge Build the coolest Linux based applications with Moblin SDK & win great prizes Grand prize is a trip for two to an Open Source event anywhere in the world http://moblin-contest.org/redirect.php?banner_id=100&url=/ _______________________________________________ Unicon-group mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/unicon-group
