[boost] Re: Interest in multiindex_set?(again)
On Monday, July 14, 2003, at 9:12 PM, Joaquín M López Muñoz wrote: Well, to sum it up, these are the public names of the library: 0 multiindex_set (haven't decided on the final name, still doubting between indexed_set and indexed_table). 1 swap, equality and comparison operators 2 index_type 3 get (function to retrieve refs to an index of a given multiindex_set) 4 iterator_type 5 const_iterator_type 6 project (function to convert between different indices's iterators) 7 index_list (a type list for specifying the indices composing a multiindex_set) 8 unique and non_unique (index traits classes that go into the index_list) get() and project() are sufficiently well specified that they won't clash with similar overloads belonging to namespace boost. But the rest of names, which are classes instead of functions, are problematic: their names are too general to be safely put into namespace boost without risk of collision. So I guess some namespace have to be set up for them. Besides, there's the problem of the yet to be written member<> utility (that you're already familiar with). This clearly is a very general utility that has no particular ties to multiindex_set, so I don't know where it should go. So the two problems are: * Where to put 2,4,5,7, and 8? For known reasons, boost::multiindex_set is not a good candidate (problems with having a class and a namespace named the same). * Where to put member<>? 2. 4. 5. 7. Couldn't these be internal typedef's in the multi-index set class? (If you meant the original definitions, I don't know since I don't have a copy of your code right now.) 8. Could go with the original definition of [7]. member: Could go with the other functional stuff in Boost, but I don't think there is a unified domain for that stuff right now. The functional stuff is spread all over the top-level namespace, except for Boost.Lambda (which uses its own exclusive sub-namespaces). Daryle ___ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Re: [boost] Re: Interest in multiindex_set?(again)
Daryle Walker ha escrito: > On Saturday, July 12, 2003, at 9:21 PM, Joaquín M López Muñoz wrote: > > > Hi again, > > > > - Mensaje Original - > > De: Fernando Cacciola <[EMAIL PROTECTED]> > > Fecha: Sábado, Julio 12, 2003 7:32 pm > > Asunto: [boost] Re: Re: Interest in multiindex_set?(again) > > > > [stuff about conceptual structure of multtindex_set deleted] > > > > OK, I'm glad we finally got to understand each other :) There's a > > problem with the name of the class. Others have expressed dislike for > > "multtindex_set". Alternative candidates are "indexed_set" and > > "indexed_table". I haven't decided yet for one, plus there's the > > problem of which namespace should this live in (regardless of whether > > it is promoted to namespace boost later). The alternatives so far are > > (name of the class/associated namespace) > > > > * multiindex_set/boost::multiindex > > * indexed_set/?? > > * indexed_table/?? > > * ??/boost::container (proposed by Daryle) > > > > boost::container I don't like because some of the associated small > > utility classes and functions (less_by, get, project) shouldn't really > > belong into a general-purpose namespace like container which is > > supposed to hold other contributions. Also, there's the additional > > problem that the class and the namespace shouldn't be named the same > > (it makes some compilers choke, this has been discussed in connection > > with Boost.Tuple). Suggestions in this area are most welcome. > [TRUNCATE] > > If the small utility classes are sufficiently independent from your > main classes, then put them in separate (possibly unrelated) > namespaces. I don't we've ever reviewed a multi-domain package, > though. Or we can review the utility parts separately, first. > > Daryle Well, to sum it up, these are the public names of the library: 0 multiindex_set (haven't decided on the final name, still doubting between indexed_set and indexed_table). 1 swap, equality and comparison operators 2 index_type 3 get (function to retrieve refs to an index of a given multiindex_set) 4 iterator_type 5 const_iterator_type 6 project (function to convert between different indices's iterators) 7 index_list (a type list for specifying the indices composing a multiindex_set) 8 unique and non_unique (index traits classes that go into the index_list) get() and project() are sufficiently well specified that they won't clash with similar overloads belonging to namespace boost. But the rest of names, which are classes instead of functions, are problematic: their names are too general to be safely put into namespace boost without risk of collision. So I guess some namespace have to be set up for them. Besides, there's the problem of the yet to be written member<> utility (that you're already familiar with). This clearly is a very general utility that has no particular ties to multiindex_set, so I don't know where it should go. So the two problems are: * Where to put 2,4,5,7, and 8? For known reasons, boost::multiindex_set is not a good candidate (problems with having a class and a namespace named the same). * Where to put member<>? Joaquín M López Muñoz Telefónica, Investigación y Desarrollo > > > ___ > Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost ___ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
[boost] Re: Interest in multiindex_set?(again)
On Saturday, July 12, 2003, at 9:21 PM, Joaquín M López Muñoz wrote: Hi again, - Mensaje Original - De: Fernando Cacciola <[EMAIL PROTECTED]> Fecha: Sábado, Julio 12, 2003 7:32 pm Asunto: [boost] Re: Re: Interest in multiindex_set?(again) [stuff about conceptual structure of multtindex_set deleted] OK, I'm glad we finally got to understand each other :) There's a problem with the name of the class. Others have expressed dislike for "multtindex_set". Alternative candidates are "indexed_set" and "indexed_table". I haven't decided yet for one, plus there's the problem of which namespace should this live in (regardless of whether it is promoted to namespace boost later). The alternatives so far are (name of the class/associated namespace) * multiindex_set/boost::multiindex * indexed_set/?? * indexed_table/?? * ??/boost::container (proposed by Daryle) boost::container I don't like because some of the associated small utility classes and functions (less_by, get, project) shouldn't really belong into a general-purpose namespace like container which is supposed to hold other contributions. Also, there's the additional problem that the class and the namespace shouldn't be named the same (it makes some compilers choke, this has been discussed in connection with Boost.Tuple). Suggestions in this area are most welcome. [TRUNCATE] If the small utility classes are sufficiently independent from your main classes, then put them in separate (possibly unrelated) namespaces. I don't we've ever reviewed a multi-domain package, though. Or we can review the utility parts separately, first. Daryle ___ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
[boost] Re: Interest in multiindex_set?(again)
Fernando Cacciola wrote: JOAQUIN LOPEZ MU?Z wrote: [...] Another thing I couldn't figure out is how to compose indices hierachically.That is, how to reproduce a typical SQL SELECT * ORDER BY X,Y,Z. Since you're modeling an indexed table, this functionality should be supported. This is an interesting subject to study, but as some others are already writing a relational framework I guess we should limit here to the basics of the container. SQL-like predicates can be built on top of it in the context of more ambitious libraries. I see. If this functionality can be built on top of it, then I think it is OK not to support it explicitely. If the library supports key extraction, for example, the the key extractor could simply return a tuple of references to e.g. the X, Y, and Z members of the value. With the default std::less comparison predicate, this would provide the desired lexigraphical ordering. [...] The most usual situation for specification of an index (standard less_than comparison on some member of the element) is already covered by less_by. Furthermore, less_by accepts an additional template parameter for specifying a comparison predicate other than std::less. For instance less_by specifies an index based on less_than comparison of employee::name (a string). If you want to use some other comparison criterium on empoyee::name (for instance, case insensitive comparison), you can write less_by Is this what you're referring to? Half of it. I see that I can use the precanned 'less-by' and specify a different comparison semantic (though in this case why is it named 'less_by' instead of 'order_by'), but I don't see how can I use it with something different than a data member as the key itself. On some applications, specially outside databases, the keys are given by runtime expressions and not just stored data members. If the users would have to end up writting their own predicates, the separaton will make that task a lot easier. Effectively, less_by to some extent *is* separating key extraction from comparison. It would be implemented to a greater extent if the additional functor were provided that was a more generic version of less_by, where the first argument is the key extractor functor. However, there are certain advantages, as I have discussed in previous posts, to integrating this separation into the table class directly. [...] If you want restoring-on-collision, use update(). To sum it up: * update() does restore-on-collision, with the overhead of an element copy. * modify() does not incur an element copy, but on collision the modified element is erased. I don't think we have other options here. I've tried to model modify() following your request and those of others, and the approach is IMHO forced by design. Anyway, I'm open to new ideas. OK, this is the best we could think of, at least right now. Many times the user knows exactly what she's doing so that no collisions will ever really ocurr. For example, if the modify does not change keys but other data. For this cases modify() is a big plus. The acutal user code could assert the result of modify() so that a coallision is treated as postcondition violation. > > > The documentation could express emphatically that if the modification > changed keys, update() must be used, else, modify() can. If the keys are not modified, a modify method is not needed. The idea of the modify method is to fix ordering as needed as a result of changing a key. Additionally, versions of modify could be provided that allow the user to specify, either by index number or a index tag (see below), in which indices reordering might be required, thus saving the cost of determining that in other indices, which the user knows will not be modified, no reordering is required. This functionality is important for efficiency. Ideally, a convenient syntax could be devised for specifying that no reordering will be required as a result of a modify operation. The problem is that it seems that, for instance: modify(it, functor); // should mean reorder in all indices modify<3>(it, functor); // reorder only 3 modify(it, functor); // reorder only indices with TagType tag Possibly, the way to do that would be to specify a tag-type that is not used by any index. E.g. if void is unused, then: modify(it, functor); // no reordering required Instead of this syntax, the alternative is to make the iterators non-constant. This is inconsistent with standard library associative containers (although this is not really an associative container); it does, however, allow for the use of standard algorithms that require non-constant iterators (but which will not modify the elements such that reordering would be necessary) without an adaptor. If non-constant iterators are not provided, it seems it would be useful to provide a const_casting iterator adaptor. A possibility is to protect modify() so that it can used but not accidentally. That is, we can prevent a
Re: [boost] Re: Interest in multiindex_set?(again)
Hi Fernando, - Mensaje Original - De: Fernando Cacciola <[EMAIL PROTECTED]> Fecha: Sábado, Julio 12, 2003 1:22 am Asunto: [boost] Re: Interest in multiindex_set?(again) > Hi Joaquín, > > Unfortunately, I douldn't compile the code with BCC because it > extensivelyuses non-type template parameters which are only > partially supported on > Borland. > > After having read the documentation and looked at the code again, > I realize > now that your class is really a database-like table but not an > associativecontainer. In fact, it is not even a container. > The problem is that you cannot traverse over _all_ the elements > linerly.(See 23.1, Table 65) Yes you can, see below. > I previously took for granted that it was a container so I asked > about the > _global_ arrangement of the elements. > I think that you shoud add iterators to traverse over the entire > table as if > it were a Sequence. This is the basic requirement of a container. > With dismay I realize how very poor my documentation skills are :) multiindex_set *is* a container in the sense you describe. Given a multiindex_set instantiation say my_mx_set witn N indices, each index provides a set-like interface with allows the user to traverse *all the elements* contained: my_mx_set mxs; ... my_mx_set::index_type::type& index_n= mxs.get(); // n between 0 and the N-1 Now, index_n.begin() and index_n.end() let you enumerate *all* the elements in mxs, regardless of n; the difference is in the order they are enumerated, which depends on the comparison predicate associated with index #n. By convention, my_mx_set inherits the functionailty of index #0, that is my_mx_set::some_memfun(...); is the same as my_mx_set::index_type<0>::type::some_memfun(...); In particular, my_mx_set::begin() and my_mx_set::end() let you traverse through then entire set of elements contained, with the order induced by index #0. Does this ask your question? If not, please let me know and I'll be happy to explain it further. I'm most determined to eliminate this semantic barrier between both of us :) Also, may I suggest you download some distribution of g++ and play with the library a little. Maybe then things become clearer. > Now, suppose that you can iterate over the all of the elements: > what's the > order in which elements will appear w.r.t the insertion sequence > and the > ordering implied by the indices? This is what I've asked you about the > "ordering and clustering invariants" of the data structure. > > If, during a linear traversal (that is, iterating over all of the > elementsas if it were a sequence), the elements will appear in an > unspecified order, > then the data structure is not an associative container (much less > a set), > so it should _definitely_ be called 'table' and not 'set'. > > I think I understand now why the term 'index'. It reseambles the > filtering-key associated with the 'field' which designates a > 'column' on a > database table. However, if I'm not mistaken, the filtered-column > itself,that is, the sequence of elements with such value in the > correspondingfield, is not an 'index'. So the subsquence which is > mapped to a given index > shouln't be called index, I think. I would call it a 'cluster' as I > suggested before. > > Since I can't play with the code I couldn't see how much of a set > is each > 'cluster'. Make sure to check with the requirements and document this > clearly. > Hopefully, this has been asked above. Again, please let me know otherwise so that I try to make myself clearer. > Another thing I couldn't figure out is how to compose indices > hierachically.That is, how to reproduce a typical SQL SELECT * > ORDER BY X,Y,Z. > Since you're modeling an indexed table, this functionality should be > supported. This is an interesting subject to study, but as some others are already writing a relational framework I guess we should limit here to the basics of the container. SQL-like predicates can be built on top of it in the context of more ambitious libraries. > > I vote for a separate KeyExtraction because: > > (a) Key extraction and key comparison are conceptually orthogonal. > > (b) The orthogonality will be important in practice when users > want to > compare whith other predicates besides less_than and/or based on > more than > just data members. > With your mixin approach, it would be more difficult to have an > index based > on a runtime expression > (such as a hash value) combined with a comparison other than <. The most usual situation for specification of an index (standard less_than comparison on some
[boost] Re: Interest in multiindex_set?(again)
Hi Joaquín, Unfortunately, I douldn't compile the code with BCC because it extensively uses non-type template parameters which are only partially supported on Borland. After having read the documentation and looked at the code again, I realize now that your class is really a database-like table but not an associative container. In fact, it is not even a container. The problem is that you cannot traverse over _all_ the elements linerly. (See 23.1, Table 65) I previously took for granted that it was a container so I asked about the _global_ arrangement of the elements. I think that you shoud add iterators to traverse over the entire table as if it were a Sequence. This is the basic requirement of a container. Now, suppose that you can iterate over the all of the elements: what's the order in which elements will appear w.r.t the insertion sequence and the ordering implied by the indices? This is what I've asked you about the "ordering and clustering invariants" of the data structure. If, during a linear traversal (that is, iterating over all of the elements as if it were a sequence), the elements will appear in an unspecified order, then the data structure is not an associative container (much less a set), so it should _definitely_ be called 'table' and not 'set'. I think I understand now why the term 'index'. It reseambles the filtering-key associated with the 'field' which designates a 'column' on a database table. However, if I'm not mistaken, the filtered-column itself, that is, the sequence of elements with such value in the corresponding field, is not an 'index'. So the subsquence which is mapped to a given index shouln't be called index, I think. I would call it a 'cluster' as I suggested before. Since I can't play with the code I couldn't see how much of a set is each 'cluster'. Make sure to check with the requirements and document this clearly. Another thing I couldn't figure out is how to compose indices hierachically. That is, how to reproduce a typical SQL SELECT * ORDER BY X,Y,Z. Since you're modeling an indexed table, this functionality should be supported. I vote for a separate KeyExtraction because: (a) Key extraction and key comparison are conceptually orthogonal. (b) The orthogonality will be important in practice when users want to compare whith other predicates besides less_than and/or based on more than just data members. With your mixin approach, it would be more difficult to have an index based on a runtime expression (such as a hash value) combined with a comparison other than <. Finally, I'm not entirely happy with the coallision response of 'modify' (or maybe I don't understand it). Is it ever possible to afford removing the colliding modified element? Imagine I change the Social Security Number of Jhon, and by mistake, I assign him the same number as Carla... the modify() will return 'false', but there won't be a Jhon anymore in the table Best, Fernando Cacciola ___ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Re: [boost] Re: Interest in multiindex_set?(again)
Jeremy Maitin-Shepard ha escrito: [...] > > find(key) and related search memfuns are already supported in the way > > you suggest! Look for "special lookup operations" in the documentation. > > Matter of fact, the operations provided are even more powerful in that > > they allow for alternative arguments and comparison predicates. > > It does not appear, however, that they support automatically detecting > the index based on the type of the key. This functionality seems > useful, although in general it is only a syntactic convenience.. > Specifically, if a table has two indices, one with a key type of int, > the other with key type of std::string, with this functionality, the > following syntax could be used: (note that the operator [] I show here > could not have the semantics of std::map; rather it would need to throw > an exception or have undefined behavior if an existing element is not found) No, I guess my design is very far away from having the ability to provide automatic index selection based on the arg type. Nevertheless, this ability wouldn't be so useful when you have severalmembers with the same type, which is expectedly not uncommon. Anyway, I underestand the lack of this feature is not a showstopper for you. > > > table["somekey"].whatever; > > The above statement would use the std::string index, because the type of > the argument (char const *) is convertible to std::string. > > table[3].whatever I did something similar in the past in a bidirectional map of mine, but here the stituation is far more complicated. Unless some brilliant mind comes here with an implementation approach, this feature won't go into the library. > > > The above statement would use the int index, because the type of the > argument is int. > > A similar syntax could be used for find, equal_range, count, erase. > Additionally, something like the following could be supported, where > table_type is the type of the table: > > table_type::index_with_key::iterator it = > table.find("something"); Same comment as above, clashes in index deduction are likely to occur for moderately coplex tables having indices over elements with the same type. Tagged indices, OTOH, are nice to have. Please read at the bottom of this post. > > > Note that you provide a version of equal_range that is not provided by > std::map or set, namely, a two-argument version, with the semantics of > returning the range between the two arguments. It is not possible to > implement this for a hash table index. Under a policy-based interface, > this could be supported by the "view" of an index that is a binary > search tree, while not being supported by the table class itself. This is an error in the docs :) I fixed the example a few days ago, please download the .zip again. Actually, equal_range has a standard signature, just like other assoc containers. > > > Similarly, finding using a weaker ordering predicate is only supported > by binary-search-tree-based indices, and thus it seems this should be > supported on a per-index basis only. > > Even if it is decided that automatically detecting the index based on > the key type is not useful, I think it is advantageous to decouple key > extraction from key comparison. > > >>Couldn't a `composite index' be supported by a key which is a tuple of > >>some number of references? > > > > > > I guess you're taking inspiration here from your own library's design. This > > implementation of composite indices simply doesn't fit here. You can consider > > things are "the other way around" with respect to indices and dependent keys: > > every index is composite. Those that happen to depend on a single member > > of the element can be conveniently built by resorting to less_by. I'm > > not sure I'm being clear enough. If not, please tell me so. > > > > On a side note, taking composite indices as indices over a tuple of > > members does not cover the full spectrum of what an index can do. Imagine an > > index whose associated comparison predicate depend on the result of an > > external function, say > > > > int f(const element& e); > > > > f() won't accept a tuple of members of element, but the actual element. To make > > it fit one would be forced to construct a new element on the fly out of the > > provided parts. > > Although you have not specified the semantics of f(), it seems that f() > is effectively a key extractor, and the key type would be int. In order > to use this with your current system of integrating key extractor with > key comparison, an additional functor would be needed that effectively > composed this function f() with some comparison predicate. In order to > allow the user to use an int value (rather than an element value) as a > key, this composition functor would also have to provide a number of > overloads some of which bypass this key extraction functor for one of > the arguments. For a hash-table-based index, it becomes even more > complicated, because a second "composition" fu
[boost] Re: Interest in multiindex_set?(again)
Joaquín Mª López Muñoz wrote: The semantics of std::map (or any other STL associative container) are not overwriting on insertion: instead, if an equivalent element is found, no insertion occurs. This is what my library does, too. Maybe you're confusing here map::operator[] with map::insert. Yes, I was confused. Your semantics are correct. There are various advantages to separating key extraction from key comparison: 1) It allows the use of more generic key extraction and key comparison functors. Specifically, for an ordered index, the key comparison function in many cases would be std::less, and can default to that. For "unordered" (hash-table-like) indices, it is necessary, at least internally, to separate key comparison and key extraction because the hash function must also operate on the "key" value; otherwise, the hash function must also be very specialized, when in many cases a default library-provided hash function can be used (and thus not specified). A small utility class named less_by is used for precisely this purpose. You can see it in action in the example accompanying the library. As for hashing, an analogous device can be written (when hashes get included). 2) It is necessary in order to support certain convenient interfaces; specifically, I think it is very convenient to allow the syntax: table[key], find(key), and equal_range(key). In order for the front-end to deteremine which index corresponds to the key type, it must "know" about keys. find(key) and related search memfuns are already supported in the way you suggest! Look for "special lookup operations" in the documentation. Matter of fact, the operations provided are even more powerful in that they allow for alternative arguments and comparison predicates. It does not appear, however, that they support automatically detecting the index based on the type of the key. This functionality seems useful, although in general it is only a syntactic convenience.. Specifically, if a table has two indices, one with a key type of int, the other with key type of std::string, with this functionality, the following syntax could be used: (note that the operator [] I show here could not have the semantics of std::map; rather it would need to throw an exception or have undefined behavior if an existing element is not found) table["somekey"].whatever; The above statement would use the std::string index, because the type of the argument (char const *) is convertible to std::string. table[3].whatever The above statement would use the int index, because the type of the argument is int. A similar syntax could be used for find, equal_range, count, erase. Additionally, something like the following could be supported, where table_type is the type of the table: table_type::index_with_key::iterator it = table.find("something"); Note that you provide a version of equal_range that is not provided by std::map or set, namely, a two-argument version, with the semantics of returning the range between the two arguments. It is not possible to implement this for a hash table index. Under a policy-based interface, this could be supported by the "view" of an index that is a binary search tree, while not being supported by the table class itself. Similarly, finding using a weaker ordering predicate is only supported by binary-search-tree-based indices, and thus it seems this should be supported on a per-index basis only. Even if it is decided that automatically detecting the index based on the key type is not useful, I think it is advantageous to decouple key extraction from key comparison. Couldn't a `composite index' be supported by a key which is a tuple of some number of references? I guess you're taking inspiration here from your own library's design. This implementation of composite indices simply doesn't fit here. You can consider things are "the other way around" with respect to indices and dependent keys: every index is composite. Those that happen to depend on a single member of the element can be conveniently built by resorting to less_by. I'm not sure I'm being clear enough. If not, please tell me so. On a side note, taking composite indices as indices over a tuple of members does not cover the full spectrum of what an index can do. Imagine an index whose associated comparison predicate depend on the result of an external function, say int f(const element& e); f() won't accept a tuple of members of element, but the actual element. To make it fit one would be forced to construct a new element on the fly out of the provided parts. Although you have not specified the semantics of f(), it seems that f() is effectively a key extractor, and the key type would be int. In order to use this with your current system of integrating key extractor with key comparison, an additional functor would be needed that effectively composed this function f() with some comparison predicate. In order to allow the user to use an int val
Re: [boost] Re: Interest in multiindex_set?(again)
Jeremy Maitin-Shepard ha escrito: > Joaquín Mª López Muñoz wrote: > > > > > This is a no-no policy. Collision can happen with more than one element. > > Following this approach could result in an single update sweeping off > > half the elements of the container :) I don't think users of the > library want > > this. > > What are the current semantics of your insert? In order to be > consistent with std::map, the behavior has to be to overwrite (delete) > an existing element with the same key. The number of elements deleted > by an insert or modify or update operation under the semantics I suggest > is at most the number of unique indices. These are the semantics I > would expect anyway, although I suppose it is a matter of opinion. The semantics of std::map (or any other STL associative container) are not overwriting on insertion: instead, if an equivalent element is found, no insertion occurs. This is what my library does, too. Maybe you're confusing here map::operator[] with map::insert. > > > > I think the analysis is not precise enough. In the modify approach you're > > proposing, modifying (by way of the user-provided functor) takes > place *before > > anything*. If this functor throws, we got nothing else to do but let > the exception > > > > propagate. It is the responsibility of the user to guarantee that the > modyfing > > functor does not leave things in a bad state. > > An acceptable (IMHO) approach for modify would be as follows: > > > > a. Call the user-provided modifying functor. If it throws, let the > exception > > propagate. > > Okay, now that I reconsider, this seems most consistent with standard > containers, since it follows the idiom of "exception thrown means > nothing happened." > > > b. Attempt reordering. > > b1. If there's a collision, delete the modified element (this can > be done in > > a no-throw way) and return false > > I still think it is better to delete the older elements rather than the > new one, consistent with my proposed semantics for insert (and > consistent with standard container insert). same comment about insertion semantics as above. > > > > b2. If there's an exception (by some comparison predicate), > delete the > > modified > > element and rethrow. > > Okay, this seems reasonable, and would be consistent with insert. So, to sum it up, if we both agree that insertion will not overwrite (as it is the semantics of STL assoc. containers), then is this approach for modify OK with you? I've included the new memfun modify following your indications and those of Fernando, and it works right. It'll go in next release, which I hope it'll be submitted for preformal review. [...] > > There are various advantages to separating key extraction from key > comparison: > > 1) It allows the use of more generic key extraction and key comparison > functors. Specifically, for an ordered index, the key comparison > function in many cases would be std::less, and can default to > that. For "unordered" (hash-table-like) indices, it is necessary, at > least internally, to separate key comparison and key extraction because > the hash function must also operate on the "key" value; otherwise, the > hash function must also be very specialized, when in many cases a > default library-provided hash function can be used (and thus not specified). A small utility class named less_by is used for precisely this purpose. You can see it in action in the example accompanying the library. As for hashing, an analogous device can be written (when hashes get included). > > > 2) It is necessary in order to support certain convenient interfaces; > specifically, I think it is very convenient to allow the syntax: > table[key], find(key), and equal_range(key). In order for the front-end > to deteremine which index corresponds to the key type, it must "know" > about keys. find(key) and related search memfuns are already supported in the way you suggest! Look for "special lookup operations" in the documentation. Matter of fact, the operations provided are even more powerful in that they allow for alternative arguments and comparison predicates. [...] > > > Couldn't a `composite index' be supported by a key which is a tuple of > some number of references? I guess you're taking inspiration here from your own library's design. This implementation of composite indices simply doesn't fit here. You can consider things are "the other way around" with respect to indices and dependent keys: every index is composite. Those that happen to depend on a single member of the element can be conveniently built by resorting to less_by. I'm not sure I'm being clear enough. If not, please tell me so. On a side note, taking composite indices as indices over a tuple of members does not cover the full spectrum of what an index can do. Imagine an index whose associated comparison predicate depend on the result of an external function, say int f(const element& e); f
[boost] Re: Interest in multiindex_set?(again)
On Tuesday, July 8, 2003, at 12:25 AM, Joaquín Muñoz wrote: [SNIP] A related question: Should boost::multiindex::multiindex_set be raised into Boost namespace as boost::multiindex_set (or whatever its final name)? Seems the standard practice, but I think it is safer to ask first. [TRUNCATE] If the library will be insanely huge, like Regex, Graph, or Spirit, then it could go in its own namespace. If libraries of a similar domain are likely to be made (or have already been made), then all of those libraries could share a namespace (like Math, Numeric, I/O). If both of these conditions exist, then you could create your namespace within the domain namespace. Else, the library is fairly small and unique, so it could go in the top Boost namespace. (Didn't someone propose a "container" namespace? Maybe this class template can go there.) Daryle ___ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
[boost] Re: Interest in multiindex_set?(again)
Joaquín Mª López Muñoz wrote: > > This is a no-no policy. Collision can happen with more than one element. > Following this approach could result in an single update sweeping off > half the elements of the container :) I don't think users of the library want > this. What are the current semantics of your insert? In order to be consistent with std::map, the behavior has to be to overwrite (delete) an existing element with the same key. The number of elements deleted by an insert or modify or update operation under the semantics I suggest is at most the number of unique indices. These are the semantics I would expect anyway, although I suppose it is a matter of opinion. > I think the analysis is not precise enough. In the modify approach you're > proposing, modifying (by way of the user-provided functor) takes place *before > anything*. If this functor throws, we got nothing else to do but let the exception > > propagate. It is the responsibility of the user to guarantee that the modyfing > functor does not leave things in a bad state. > An acceptable (IMHO) approach for modify would be as follows: > > a. Call the user-provided modifying functor. If it throws, let the exception > propagate. Okay, now that I reconsider, this seems most consistent with standard containers, since it follows the idiom of "exception thrown means nothing happened." > b. Attempt reordering. > b1. If there's a collision, delete the modified element (this can be done in > a no-throw way) and return false I still think it is better to delete the older elements rather than the new one, consistent with my proposed semantics for insert (and consistent with standard container insert). > b2. If there's an exception (by some comparison predicate), delete the > modified > element and rethrow. Okay, this seems reasonable, and would be consistent with insert. > I'm strongly against key extraction: it complicates the instantiation of the > container, has no apparent (to me) usefulness and bans the use of composite > indices. Single-member indices are well served by the less_by utility. > Maybe someone can say something in favor of key extraction: otherwise, > it won't get in. There are various advantages to separating key extraction from key comparison: 1) It allows the use of more generic key extraction and key comparison functors. Specifically, for an ordered index, the key comparison function in many cases would be std::less, and can default to that. For "unordered" (hash-table-like) indices, it is necessary, at least internally, to separate key comparison and key extraction because the hash function must also operate on the "key" value; otherwise, the hash function must also be very specialized, when in many cases a default library-provided hash function can be used (and thus not specified). 2) It is necessary in order to support certain convenient interfaces; specifically, I think it is very convenient to allow the syntax: table[key], find(key), and equal_range(key). In order for the front-end to deteremine which index corresponds to the key type, it must "know" about keys. Note that having key extractors does not prohibit using a comarison functor that is integrated with key extraction, since the formal key extractor can be specified as an identity functor, thus the key_type is the value_type. Couldn't a `composite index' be supported by a key which is a tuple of some number of references? > As for the non-const iterator, I don't really know what the status of this > proposal for std::set is (can someone provide some pointer to info?) > My feeling is that if modify is finally provided, its computational cost in > case 3 is acceptable (it adds 2N comparisons where N is the number of > indices). So I wouldn't pursue the non-const iterator approach (and as > always you can resort to const_casting). http://std.dkuug.dk/jtc1/sc22/wg21/docs/lwg-defects.html#103 I rechecked the status of the proposal, and it appears that the LWG has decided to clarify that both set iterator and const_iterator are constant. The text describing the defect (not the resolution) is somewhat inconsistent with the actual resolution, and it appears that the defect report initially proposed as the resolution requiring a non-constant set iterator, but the LWG later changed the resolution to the one upon which they decided. Thus it appears your decision to have only constant iterators is correct. It occurs to me that a mechanism for the user to specify that reordering may be required in only certain indices as a result of a modify operation. This syntax could be used: table.modify(it, functor); // all indices are updated table.modify<0, 1, 3>(it, functor); // indices 0, 1, and 3 are updated table.modify(it, functor); // indices with tag1 are updated table.modify(it, functor); // indices with either tag1 or // tag2 are updated (The tagging m
Re: [boost] Re: Interest in multiindex_set?(again)
Jeremy Maitin-Shepard ha escrito: [...] > > How about indexed_table? This container is *not* a set, since there can > be duplicates, but it *does* resemble a relational database table. > > Then it can be defined in namespace boost::table, and additionally > promoted to namespace boost. Seems good to me, it'd be great to know others' opinion. [...] > > Actually, using partial specialization, you can avoid re-specifying the > default/unique. multiple would possibly be a better name than > non_unique. Additionally, I would recommend allowing a specification of > the default "uniqueness" of the indices. Seems good too :) I think I got an idea of how to implement this, I'll put myself to work on it. > > > > (I plan to add hash indices sometime by the end of 2003). > > A policy-based approach seems the best way to support different types of > indices. Currently, I am working on a policy-based equivalent to this > class which will support an "ordered" (binary search tree-based) index > and an "unordered" (hash-table-based) index, as well as any user-defined > index policies (which would allow a different hash table or tree > implementation). This should be completed within a week or two. > Do you intend to submit that to Boost? If so, then there's some collision between the two libraries, maybe we can cooperate somehow. If not, maybe you can lend some ideas to (re-baptised) indexed_table. Please drop me a mail if you feel like discussing this. > > > I don't think it is a question of how smartly update() is coded, but an > > actual conceptual impossibility. If we are to provide strong > exception-safety, > > then no fail or throw is permitted after the element has changed its > value. > > So, there are three scenarios: > > > > 1. Updating can possibly fail (because of collision with some other > element) or > > > > reorder the updated element: no other soluction but to incur a copy > of the > > element, as the actual updating has to take place *after* the > reordering is > > done. > > It seems the best behavior in the case of a collision is to delete the > old element with which there is a collision. This would be consistent > with the semantics of insert. This is a no-no policy. Collision can happen with more than one element. Following this approach could result in an single update sweeping off half the elements of the container :) I don't think users of the library want this. I think current update() has some value as it is now. I keep discussing the alternate memfun modify you propose below. > > > Instead of an update method, a modify method that takes an iterator and > a one-argument (a reference to value_type) functor which it executes on > the value to which the iterator points. Combined with boost.lambda and > boost.bind, this system can be very convenient. This is similar to what Fernando Cacciola proposed some days ago. > > > In the case that the modification functor (or copy constructor, in the > case of the existing update method) throws an exception, the semantics > could be one of the following: > > 1. The container is left in an undefined state, and the exception is > propogated to the caller. > > 2. The exception is caught, reordering is attempted; if an exception is > thrown while reordering, by one of the comparison, etc. functors, the > container is left in an undefined state and the exception is propogated > to the caller; if reordering succeeds, the original exception thrown the > by modification functor is rethrown to the caller. Thus, in either case > an exception is propogated to the caller, and by the type of the > exception whether the container is in an undefined state. > > 3. Same as (2), except that in the case that the reordering filas, > before the exception thrown by a comparison function is propogated to > the caller, the would-be-modified element is erased from the container, > leaving it in a well-defined state. The exception thrown by the > comparison function is still propogated to the user. > > (3) seems like the most desirable solution, thus that is the behavior I > would recommend. I think the analysis is not precise enough. In the modify approach you're proposing, modifying (by way of the user-provided functor) takes place *before anything*. If this functor throws, we got nothing else to do but let the exception propagate. It is the responsibility of the user to guarantee that the modyfing functor does not leave things in a bad state. An acceptable (IMHO) approach for modify would be as follows: a. Call the user-provided modifying functor. If it throws, let the exception propagate. b. Attempt reordering. b1. If there's a collision, delete the modified element (this can be done in a no-throw way) and return false b2. If there's an exception (by some comparison predicate), delete the modified element and rethrow. What dou you think about it? [...] > > > 3. Updating does not alter the order of the updated elemen
[boost] Re: Interest in multiindex_set?(again)
Joaquín Mª López Muñoz wrote: > If we forget about the namespace name, then I have no objections against > indexed_set (though std::sets are indexed by nature, but this is probably not > a common perception between users). > I thought long and hard about name candidates and come up with none except > multiindex_set. > If no one says anything against it, I'll change it to indexed_set. The problem > remains > of how to name its associated namespace. How about indexed_table? This container is *not* a set, since there can be duplicates, but it *does* resemble a relational database table. Then it can be defined in namespace boost::table, and additionally promoted to namespace boost. >> I don't see any acceptable (non-invasive) way to implement marking only one >> class. But the unique/non_unique two template approach seems perfectly >> acceptable. You could even argue that it forces the user to consciously >> make the critical unique/non_unique decision for each index, so may reduce >> user error. > > > I agree with you. The all-predicates-are-marked approach is even better in the > light > of future change Actually, using partial specialization, you can avoid re-specifying the default/unique. multiple would possibly be a better name than non_unique. Additionally, I would recommend allowing a specification of the default "uniqueness" of the indices. > (I plan to add hash indices sometime by the end of 2003). A policy-based approach seems the best way to support different types of indices. Currently, I am working on a policy-based equivalent to this class which will support an "ordered" (binary search tree-based) index and an "unordered" (hash-table-based) index, as well as any user-defined index policies (which would allow a different hash table or tree implementation). This should be completed within a week or two. > I don't think it is a question of how smartly update() is coded, but an > actual conceptual impossibility. If we are to provide strong exception-safety, > then no fail or throw is permitted after the element has changed its value. > So, there are three scenarios: > > 1. Updating can possibly fail (because of collision with some other element) or > > reorder the updated element: no other soluction but to incur a copy of the > element, as the actual updating has to take place *after* the reordering is > done. It seems the best behavior in the case of a collision is to delete the old element with which there is a collision. This would be consistent with the semantics of insert. Instead of an update method, a modify method that takes an iterator and a one-argument (a reference to value_type) functor which it executes on the value to which the iterator points. Combined with boost.lambda and boost.bind, this system can be very convenient. In the case that the modification functor (or copy constructor, in the case of the existing update method) throws an exception, the semantics could be one of the following: 1. The container is left in an undefined state, and the exception is propogated to the caller. 2. The exception is caught, reordering is attempted; if an exception is thrown while reordering, by one of the comparison, etc. functors, the container is left in an undefined state and the exception is propogated to the caller; if reordering succeeds, the original exception thrown the by modification functor is rethrown to the caller. Thus, in either case an exception is propogated to the caller, and by the type of the exception whether the container is in an undefined state. 3. Same as (2), except that in the case that the reordering filas, before the exception thrown by a comparison function is propogated to the caller, the would-be-modified element is erased from the container, leaving it in a well-defined state. The exception thrown by the comparison function is still propogated to the user. (3) seems like the most desirable solution, thus that is the behavior I would recommend. > 2. Updating won't fail (because the progammer knows it) but can result in > reordering. Here a no-throw update is possible that doesn't require an > additional copy. In this case, the modify method described above is sufficient, and the exception behavior is irrelevant. > 3. Updating does not alter the order of the updated element (because the > programmer knows it). One can just change the value in-place and forget > about it. A non-const iterator should be provided. (I believe this is what has been proposed for std::set) Requiring that const_cast is used to obtain a non-const value from an iterator seems like too strong a warning. In the case where the key for an index (assuming you separate key extraction from key comparison as has been recommended) is a particular member of the value, it is often rather obvious whether a particular operation will or will not modify the key. --- Jeremy Maitin-Shepard
Re: [boost] Re: Interest in multiindex_set?(again)
Beman Dawes ha escrito: [...] > I'm more interested in the class name than the namespace name. One problem > at a time. If you weren't worrying about the namespace name, would you then > like indexed_set as the class name? What are some other alternatives? If we forget about the namespace name, then I have no objections against indexed_set (though std::sets are indexed by nature, but this is probably not a common perception between users). I thought long and hard about name candidates and come up with none except multiindex_set. If no one says anything against it, I'll change it to indexed_set. The problem remains of how to name its associated namespace. [...] > > Yes, I think so. It is a bit of a judgement call, but since the library is > a very general purpose tool, and assuming the library only adds a few names > to the namespace it lives in, I'd rather see it in namespace boost. Others > may disagree; a lot of people like tall and deep namespace hierarchies even > when the components are very general purpose. OK, next installment will have the container promoted to boost::indexed_set. There are some auxiliary classes which I don't know exactly where to put, I'll think it over. > I don't see any acceptable (non-invasive) way to implement marking only one > class. But the unique/non_unique two template approach seems perfectly > acceptable. You could even argue that it forces the user to consciously > make the critical unique/non_unique decision for each index, so may reduce > user error. I agree with you. The all-predicates-are-marked approach is even better in the light of future change (I plan to add hash indices sometime by the end of 2003). [...] > > I understand the rationale for that, but am still concerned. There are many > applications where the cost of a copy to apply a minor update is > prohibitive. The user is then forced to add a layer of indirection, or use > a home-made container. Ugh. This is one of the few cases I run into where a > safe design is so inefficient compared to a traditional unsafe design that > in real production code I might be forced to use the unsafe design. That > makes me uncomfortable. I don't think it is a question of how smartly update() is coded, but an actual conceptual impossibility. If we are to provide strong exception-safety, then no fail or throw is permitted after the element has changed its value. So, there are three scenarios: 1. Updating can possibly fail (because of collision with some other element) or reorder the updated element: no other soluction but to incur a copy of the element, as the actual updating has to take place *after* the reordering is done. 2. Updating won't fail (because the progammer knows it) but can result in reordering. Here a no-throw update is possible that doesn't require an additional copy. 3. Updating does not alter the order of the updated element (because the programmer knows it). One can just change the value in-place and forget about it. 1 is taken care of by current multiindex_set::update. 3 can be solved with appropriate const_casting: this is dangerous as any cast is, but then again the programmer *knows* what she's doing. Furthermore, this is exactly the same situation as in std::set. We are left with 2: please note that it is extremely error-prone to trust the programmer in her implicit knowing that no collision will happen (checking 3, on the other hand, is much easier, as probably what the programmer is changing are auxiliary members that no comparison predicate depend on). Now, if the no-collision guarantee is assumed, the updating can be performed without throwing. To sum it up, as it stands now mutiindex_set allows for updating in scenarios 1 and 3. Scenario 2 (programmer knows the updated element won't clash with some other element) can be realized (Fernando suggested a nice way to do it), but my opinion is that this facility would be extremely dangerous to handle. I'd rather leave things as they are now, but I'm open to suggestions, maybe I'm missing something obvious that will solve the whole problem in a fell swoop. Open issues: * Name will change to indexed_set if no one complains. What name to use for the associated namespace? * Is it advisable to provide an no-copy update facility relying on the assumption that updating won't clash with other elements in the set? Joaquín M López Muñoz Telefónica, Investigación y Desarrollo ___ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Re: [boost] Re: Interest in multiindex_set?(again)
At 06:25 PM 7/7/2003, =?windows-1252?Q?JOAQUIN_LOPEZ_MU=3FZ?= wrote: >Hi Beman, > >- Mensaje Original - >[...] >> * The "multiindex_set" name seems awkward to me. Maybe >> "indexed_set" or >> "set_index"? > >I don't like the name either, and would be happy if someone comes >with something better. Nevertheless, I don't think indexed_set is a >good choice: when picking up a name another one must be chosen >for the associated namespace (and it is convenient that these two are >not the same, for reasons explained vg in Boost.Tuple docs). I >chose multiindex_set/multiindex. The analogous indexed_set/index >does not fit: index is too broad a denomination for the namespace. I'm more interested in the class name than the namespace name. One problem at a time. If you weren't worrying about the namespace name, would you then like indexed_set as the class name? What are some other alternatives? >A related question: Should boost::multiindex::multiindex_set be >raised into Boost namespace as boost::multiindex_set (or whatever >its final name)? Seems the standard practice, but I think it is safer >to ask first. Yes, I think so. It is a bit of a judgement call, but since the library is a very general purpose tool, and assuming the library only adds a few names to the namespace it lives in, I'd rather see it in namespace boost. Others may disagree; a lot of people like tall and deep namespace hierarchies even when the components are very general purpose. >> typedef boost::multiindex::multiindex_set< >> employee, >> boost::tuple< unique< employee::comp_id >, >> non_unique< employee::comp_name >, >> unique< employee_name::comp_ssnumber > >, >> > employee_set; >> > >This is definitely a good idea, Fernando also hinted at symplifing >the specification of unique/non-unique indices. Any suggestion on how >to implement it? With two templates for unique and non-unique indices >the thing is simple; it is not as simple when only one class is marked. >Suggestions here are most welcome. I don't see any acceptable (non-invasive) way to implement marking only one class. But the unique/non_unique two template approach seems perfectly acceptable. You could even argue that it forces the user to consciously make the critical unique/non_unique decision for each index, so may reduce user error. >[...] >> * It isn't clear (or I missed it) if iterators are always >> logically >> const-iterators or not. In other words, could your example have >> been >> written like this? >> >> typedef index_type::type employee_set_by_name; >> employee_set_by_name& i=es.get<1>(); >> >> employee_set_by_name::iterator it=i.find("Anna Jones"); >> it->name="Anna Smith"; //she just got married to Calvin Smith >> > >No it couldn't. Iterators and const_iterators (which incidentally are the >same) behave logically as const-iterators, just the same as in STL sets. >update() has been carefully crafted so that strong exception-safety is >provided: for that, it is necessary that a copy of the element is provided >instead of allowing some sort of in-place modification of the element. >This is discussed in some detail with Fernando somewhere up this thread. I understand the rationale for that, but am still concerned. There are many applications where the cost of a copy to apply a minor update is prohibitive. The user is then forced to add a layer of indirection, or use a home-made container. Ugh. This is one of the few cases I run into where a safe design is so inefficient compared to a traditional unsafe design that in real production code I might be forced to use the unsafe design. That makes me uncomfortable. >Thanks for your interest. If no objection is raised I plan to submit >the library for pre-formal review in a couple of weeks. Good! --Beman ___ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Re: [boost] Re: Interest in multiindex_set?(again)
Hi Beman, - Mensaje Original - [...] > * The "multiindex_set" name seems awkward to me. Maybe > "indexed_set" or > "set_index"? I don't like the name either, and would be happy if someone comes with something better. Nevertheless, I don't think indexed_set is a good choice: when picking up a name another one must be chosen for the associated namespace (and it is convenient that these two are not the same, for reasons explained vg in Boost.Tuple docs). I chose multiindex_set/multiindex. The analogous indexed_set/index does not fit: index is too broad a denomination for the namespace. A related question: Should boost::multiindex::multiindex_set be raised into Boost namespace as boost::multiindex_set (or whatever its final name)? Seems the standard practice, but I think it is safer to ask first. > > * It seems safer to me for the default to be indexes that are unique. > > * Separating the specification of uniqueness from the list of > indices > seems like a recipe for mistakes, particularly during maintenance. > Is it > possible to combine uniqueness specification with the index tuple? > Perhaps > your example could look like this: > > typedef boost::multiindex::multiindex_set< > employee, > boost::tuple< employee::comp_id, > multi< employee::comp_name >, > employee_name::comp_ssnumber >, > > employee_set; > > or this: > > typedef boost::multiindex::multiindex_set< > employee, > boost::tuple< unique< employee::comp_id >, > non_unique< employee::comp_name >, > unique< employee_name::comp_ssnumber > >, > > employee_set; > This is definitely a good idea, Fernando also hinted at symplifing the specification of unique/non-unique indices. Any suggestion on how to implement it? With two templates for unique and non-unique indices the thing is simple; it is not as simple when only one class is marked. Suggestions here are most welcome. [...] > * It isn't clear (or I missed it) if iterators are always > logically > const-iterators or not. In other words, could your example have > been > written like this? > > typedef index_type::type employee_set_by_name; > employee_set_by_name& i=es.get<1>(); > > employee_set_by_name::iterator it=i.find("Anna Jones"); > it->name="Anna Smith"; //she just got married to Calvin Smith > No it couldn't. Iterators and const_iterators (which incidentally are the same) behave logically as const-iterators, just the same as in STL sets. update() has been carefully crafted so that strong exception-safety is provided: for that, it is necessary that a copy of the element is provided instead of allowing some sort of in-place modification of the element. This is discussed in some detail with Fernando somewhere up this thread. > Anyhow, I'm glad to see you working on this, and hope you will > continue > with it. > Thanks for your interest. If no objection is raised I plan to submit the library for pre-formal review in a couple of weeks. Regards, Joaquín M López Muñoz Telefónica, Investigación y Desarrollo ___ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
[boost] Re: Interest in multiindex_set?(again)
At 03:15 PM 7/7/2003, Joaquín Mª López Muñoz wrote: >Please find some preliminary documentation for multiindex_set >at: > >http://groups.yahoo.com/group/boost/files/multiindex.zip Hi Joaquín, I took a quick look at multiindex.html, and found it quite interesting. It appears to meet a common need. Here are some specific (but not carefully thought out) comments: * The "multiindex_set" name seems awkward to me. Maybe "indexed_set" or "set_index"? * It seems safer to me for the default to be indexes that are unique. * Separating the specification of uniqueness from the list of indices seems like a recipe for mistakes, particularly during maintenance. Is it possible to combine uniqueness specification with the index tuple? Perhaps your example could look like this: typedef boost::multiindex::multiindex_set< employee, boost::tuple< employee::comp_id, multi< employee::comp_name >, employee_name::comp_ssnumber >, > employee_set; or this: typedef boost::multiindex::multiindex_set< employee, boost::tuple< unique< employee::comp_id >, non_unique< employee::comp_name >, unique< employee_name::comp_ssnumber > >, > employee_set; * The need for an update member function also comes up in other non-STL containers like B-trees, so don't worry that it is an add-on. * It isn't clear (or I missed it) if iterators are always logically const-iterators or not. In other words, could your example have been written like this? typedef index_type::type employee_set_by_name; employee_set_by_name& i=es.get<1>(); employee_set_by_name::iterator it=i.find("Anna Jones"); it->name="Anna Smith"; //she just got married to Calvin Smith Anyhow, I'm glad to see you working on this, and hope you will continue with it. --Beman ___ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Re: [boost] Re: Interest in multiindex_set? (again)
Arkadiy Vertleyb ha escrito: [...] > > The reason we are interested in the multiindex_set is that we want to > provide a more efficient implementation for our table, that is currently > implemented with std::vector, although Ed Brey already suggested an > alternative with std::set(that we need to examine). > > Regards, > > Arkadiy > I suggest you download multiindex.zip from the files section and play with it to see if it fits your needs. If you need assistance with the usage of the library don't hesitate to contact me. Next week some sort of preliminary documentation should be prepared, anyway. Regards, Joaquín M López Muñoz Telefónica, Investigación y Desarrollo ___ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
[boost] Re: Interest in multiindex_set? (again)
I am just re-posting this response, since my first attempt to Cc it didn't work: The "Relational Template Library" is not a part of Boost (although we would eventially like it to be). We introduced it about a year ago, and got little response at that time. The library then included a relational table (capable of storing any objects in its fields in a typesafe manner), and basic relational operators. We were critisized then for the lack of indexing. We added indexing capability about half a year ago. We had some positive response from Boost, but eventially the discussion, again, died away. Although a few people seemed to be interested, no one apparently has really used the library or tried it out. We also wrote an article for CUJ, and got positive response from them, too. If I understand them correctly, they did accept the article for publication, but put it on hold for now... Right at this moment we are in the middle of the third round of development. We are going to introduce basic transactions. Along with this also comes automatic index rebuild. That means you can have an arbitrary complicated expression, and an index on top of it (we allow to index any expession, not only tables), then change any tables this expression is based upon, and have the index automatically (and incrementally) updated, so you can call begin() on it, and start using it right away (of course iterators, as always, get invalidated). We also met some compiler limitations related to the template tree size, and found an interesting way to deal with it. We are planning to come up with a new version (that would include the above capabilities) within the next month or two. The reason we are interested in the multiindex_set is that we want to provide a more efficient implementation for our table, that is currently implemented with std::vector, although Ed Brey already suggested an alternative with std::set(that we need to examine). Regards, Arkadiy -Original Message- From: Darren Cook [mailto:[EMAIL PROTECTED] Sent: Wednesday, July 02, 2003 3:42 AM To: [EMAIL PROTECTED] Subject: Re: [boost] Re: Interest in multiindex_set? (again) In boost mailing list you wrote: We may use this to provide a better table implementation for the Relational Template Library, we are currently working on. Is that "Relational Template Library" part of Boost? (or going to be). Where can I learn more about what you have planned? Darren _ Add photos to your e-mail with MSN 8. Get 2 months FREE*. http://join.msn.com/?page=features/featuredemail ___ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
[boost] Re: Interest in multiindex_set? (again)
Arkadiy wrote: > Right now we are using a sorted vector instead of a set, to implement > our relational tables, because set doesn't allow us to search on a > prefix of a key. Like if a table is indexed on a, b, c, we are not > able to use equal_range on a, b with the set. I was able to successfully use a set for the indices into an in-memory database I wrote for my company a few years back. By providing a custom Compare object, I was able to implement prefix-based seeking using equal_range. I allowed an element to indicate that it is truncated (e.g. contains only a and b, but not c). The Comparison function used the indication when checking whether one element is less than another. ___ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
[boost] Re: Interest in multiindex_set? (again)
We may use this to provide a better table implementation for the Relational Template Library, we are currently working on. Right now we are using a sorted vector instead of a set, to implement our relational tables, because set doesn't allow us to search on a prefix of a key. Like if a table is indexed on a, b, c, we are not able to use equal_range on a, b with the set. Sorted vector allows us to do this, but sorted vector has obvious problems with inserts/deletes. Do you think your library can provide both search on a key prefix and fast inserts/deletes? Arkadiy "JOAQUIN LOPEZ MU?Z" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] Hi all, A few weeks ago I prompted for interest in a multiply indexed set container, whth alas little response from the community. In the meanwhile, I continued working in multiindex_set and now the library is 90% completed, save docs. It is available at http://groups.yahoo.com/group/boost/files/multiindex.zip and have been checked to compile with MSVC++ 6.5 and gcc 3.2 under cygwin. IMHO multiindex_set is an useful resource and can help programmers when they are in need for data structures more complex than simple std::sets and std::multisets. In these cases multiindex_set is a more robust solution than the usual resort to ad hoc constructs based on containers of iterators and stuff like that. I am aware the complete lack of docs is a big barrier to the evaluation of the library, so the following serves as a basic introduction to multiindex_set that hopefully can motivate the reader into trying the library. An extensive example of use is provided in the URL above showing the whole set of capabilities offered by the container. * Usage: As std::set takes a comparison predicate to index the elements contained, multiindex_set is instantiated with a boost::tuple of such predicates, which we call indices. For instance, say we have class employee with sorting predicates based on some id (upon which employe::operator < is written), name and age. One can construct a set providing indices to all three orderings like this: typedef multiindex_set< employee, tuple< std::less, less_by, less_by > > employee_set; less_by is a facility intended to help construct a sorting predicate based on a single member of the class. In general, one can use whatever predicate with the right signature. Unless explictly stated the fist index (std::less in our example) is taken to be *unique*, while all others allow for duplicates. This means one can insert employees with the same age but not the same id. multiindex_set allows for specification of zero, one or more unique indices, much in the same way as a relational table does. So, the following constructs a multiindex_set with unique indices for id and social security number: typedef multiindex_set< employee, tuple< std::less, less_by, less_by, less_by >, unique_indices<0,3> // id and ss number > employee_set; Once constructed, multiindex_set holds as many internal orderings as indices are specified. Each index has the same interface as a std::set container. By itself, multiindex_set inherits the functionality of the first (primary) index. employee_set es; // retrieves a "view" based on the first index (name) employee_set::index_type<1>::type& i1=es.get<1>(); // ordered by id std::cout<<"by id"<(std::cout)); // ordered by name std::cout<<"by name"<(std::cout)); * Special lookup operations: When elements are expensive to construct, it is a nuissance to have to build a whole object just to search for a particular member of it. multiindex_set duplicates the std::set lookup facilities to help you perform searchs without supplying entire objects: i1.find("John"); // find John using the name index For this to work the predicates must be extended to accept the appropriate subobjects (less_by automatically does this). One can even pass in "compatible" sorting predicates, where compatible means "inducing a weaker compatible ordering", as the following example shows: i1.lower_bound('J',employee::comp_initial() // first employee whose name begins with J, using the // name index and the special employee::comp_initial predicate. * Update: given an element pointed to by a multiindex_set iterator, one can change its contents via the update member function: employee_iterator it=es.find(0); // employee with id==0 employee e=*it; e.id=100; // change her ID es.update(it,es); // and update the record. updating is done efficiently (constant time) if the ordering of the updated element remains the same with respect to all indices, and logarithmically for other cases. Interestingly enough, iterators and references to an element remain valid after updating. update provides strong exception-safety, in the sense of Stroustrup's TC++PL sp. ed. * Implementation: multiindex_set has been written from scatch with Boost in mind,
[boost] Re: Interest in multiindex_set? (again)
This looks fine in general. I've needed something like it so I'm intereseted on seeing this on boost. Some issues: (1) Why 'index' instead of 'key'? Associative containers use the term key instead of index, since index is typically related with random access instead of look up access. (2) I would explore a friendlier interface. For instance: (2.1) key uniqueness could be specified along with the key itself and not with a separate template parameter. (2.2) keys could be tagged so that 'get' could be parametrized by key tag instead of number. (3) This is a showstopper for me: The key mutating operations should use a higher level interface which protects clustering and ordering invariants. With the current interface, if I understood correctly, I could forget to call update() which would break such invariants. A better setup would be to use something along the lines of the 'modifiers' used by the CGAL Library. Check out the section named "Protected Access to Internal Representations" on this page: "http://www.cgal.org/Manual/doc_html/frameset/fsSupport.html"; Best, Fernando Cacciola ___ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost