[boost] Re: Interest in multiindex_set?(again)

2003-07-16 Thread Daryle Walker
On Monday, July 14, 2003, at 9:12 PM, Joaquín M López Muñoz wrote:

Well, to sum it up, these are the public names of the library:

0 multiindex_set (haven't decided on the final name, still doubting 
between indexed_set and indexed_table).
1 swap, equality and comparison operators
2 index_type
3 get (function to retrieve refs to an index of a given multiindex_set)
4 iterator_type
5 const_iterator_type
6 project (function to convert between different indices's iterators)
7 index_list (a type list for specifying the indices composing a 
multiindex_set)
8 unique and non_unique (index traits classes that go into the 
index_list)

get() and project() are sufficiently well specified that they won't 
clash with similar overloads belonging to namespace boost. But the 
rest of names, which are classes instead of functions, are 
problematic: their names are too general to be safely put into 
namespace boost without risk of collision. So I guess some namespace 
have to be set up for them. Besides, there's the problem of the yet to 
be written member<> utility (that you're already familiar with). This 
clearly is a very general utility that has no particular ties to 
multiindex_set, so I don't know where it should go.

So the two problems are:

* Where to put 2,4,5,7, and 8? For known reasons, 
boost::multiindex_set is not a good candidate (problems with having a 
class and a namespace named the same).
* Where to put member<>?
2. 4. 5. 7.	Couldn't these be internal typedef's in the multi-index set 
class?  (If you meant the original definitions, I don't know since I 
don't have a copy of your code right now.)

8.	Could go with the original definition of [7].

member: Could go with the other functional stuff in Boost, but I don't 
think there is a unified domain for that stuff right now.  The 
functional stuff is spread all over the top-level namespace, except for 
Boost.Lambda (which uses its own exclusive sub-namespaces).

Daryle

___
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost


Re: [boost] Re: Interest in multiindex_set?(again)

2003-07-14 Thread Joaquín Mª López Muñoz


Daryle Walker ha escrito:

> On Saturday, July 12, 2003, at 9:21 PM, Joaquín M López Muñoz wrote:
>
> > Hi again,
> >
> > - Mensaje Original -
> > De: Fernando Cacciola <[EMAIL PROTECTED]>
> > Fecha: Sábado, Julio 12, 2003 7:32 pm
> > Asunto: [boost] Re: Re: Interest in multiindex_set?(again)
> >
> > [stuff about conceptual structure of multtindex_set deleted]
> >
> > OK, I'm glad we finally got to understand each other :) There's a
> > problem with the name of the class. Others have expressed dislike for
> > "multtindex_set". Alternative candidates are "indexed_set" and
> > "indexed_table". I haven't decided yet for one, plus there's the
> > problem of which namespace should this live in (regardless of whether
> > it is promoted to namespace boost later). The alternatives so far are
> > (name of the class/associated namespace)
> >
> > * multiindex_set/boost::multiindex
> > * indexed_set/??
> > * indexed_table/??
> > * ??/boost::container (proposed by Daryle)
> >
> > boost::container I don't like because some of the associated small
> > utility classes and functions (less_by, get, project) shouldn't really
> > belong into a general-purpose namespace like container which is
> > supposed to hold other contributions. Also, there's the additional
> > problem that the class and the namespace shouldn't be named the same
> > (it makes some compilers choke, this has been discussed in connection
> > with Boost.Tuple). Suggestions in this area are most welcome.
> [TRUNCATE]
>
> If the small utility classes are sufficiently independent from your
> main classes, then put them in separate (possibly unrelated)
> namespaces.  I don't we've ever reviewed a multi-domain package,
> though.  Or we can review the utility parts separately, first.
>
> Daryle

Well, to sum it up, these are the public names of the library:

0 multiindex_set (haven't decided on the final name, still doubting
between indexed_set and indexed_table).
1 swap, equality and comparison operators
2 index_type
3 get (function to retrieve refs to an index of a given multiindex_set)
4 iterator_type
5 const_iterator_type
6 project (function to convert between different indices's iterators)
7 index_list (a type list for specifying the indices composing a multiindex_set)
8 unique and non_unique (index traits classes that go into the index_list)

get() and project() are sufficiently well specified that they won't clash with
similar overloads belonging to namespace boost. But the rest of names,
which are classes instead of functions, are problematic: their names are too
general to be safely put into namespace boost without risk of collision.
So I guess some namespace have to be set up for them.
Besides, there's the problem of the yet to be written member<> utility (that
you're already familiar with). This clearly is a very general utility that has no
particular ties to multiindex_set, so I don't know where it should go.

So the two problems are:

* Where to put 2,4,5,7, and 8? For known reasons, boost::multiindex_set
is not a good candidate (problems with having a class and a namespace named
the same).
* Where to put member<>?

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo

>
>
> ___
> Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

___
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost


[boost] Re: Interest in multiindex_set?(again)

2003-07-12 Thread Daryle Walker
On Saturday, July 12, 2003, at 9:21 PM, Joaquín M López Muñoz wrote:

Hi again,

- Mensaje Original -
De: Fernando Cacciola <[EMAIL PROTECTED]>
Fecha: Sábado, Julio 12, 2003 7:32 pm
Asunto: [boost] Re: Re: Interest in multiindex_set?(again)
[stuff about conceptual structure of multtindex_set deleted]

OK, I'm glad we finally got to understand each other :) There's a 
problem with the name of the class. Others have expressed dislike for 
"multtindex_set". Alternative candidates are "indexed_set" and 
"indexed_table". I haven't decided yet for one, plus there's the 
problem of which namespace should this live in (regardless of whether 
it is promoted to namespace boost later). The alternatives so far are 
(name of the class/associated namespace)

* multiindex_set/boost::multiindex
* indexed_set/??
* indexed_table/??
* ??/boost::container (proposed by Daryle)
boost::container I don't like because some of the associated small 
utility classes and functions (less_by, get, project) shouldn't really 
belong into a general-purpose namespace like container which is 
supposed to hold other contributions. Also, there's the additional 
problem that the class and the namespace shouldn't be named the same 
(it makes some compilers choke, this has been discussed in connection 
with Boost.Tuple). Suggestions in this area are most welcome.
[TRUNCATE]

If the small utility classes are sufficiently independent from your 
main classes, then put them in separate (possibly unrelated) 
namespaces.  I don't we've ever reviewed a multi-domain package, 
though.  Or we can review the utility parts separately, first.

Daryle

___
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost


[boost] Re: Interest in multiindex_set?(again)

2003-07-12 Thread Jeremy Maitin-Shepard
Fernando Cacciola wrote:
JOAQUIN LOPEZ MU?Z wrote:
[...]

Another thing I couldn't figure out is how to compose indices
hierachically.That is, how to reproduce a typical SQL SELECT *
ORDER BY X,Y,Z.
Since you're modeling an indexed table, this functionality should be
supported.
This is an interesting subject to study, but as some others are
already
writing a relational framework I guess we should limit here to the
basics of the container.
SQL-like predicates can be built on top of it
in the context of more ambitious libraries.
I see.
If this functionality can be built on top of it,
then I think it is OK not to support it explicitely.
If the library supports key extraction, for example, the the key 
extractor could simply return a tuple of references to e.g. the X, Y, 
and Z members of the value.  With the default std::less comparison 
predicate, this would provide the desired lexigraphical ordering.

[...]

The most usual situation for specification of an index (standard
less_than comparison on some member of the element) is already
covered by less_by. Furthermore, less_by accepts an additional
template parameter for specifying a comparison predicate other
than std::less. For instance
 less_by

specifies an index based on less_than comparison of employee::name
(a string). If you want to use some other comparison criterium on
empoyee::name (for instance, case insensitive comparison), you can
write
less_by

Is this what you're referring to?

Half of it.
I see that I can use the precanned 'less-by' and specify a different comparison 
semantic (though in this case why is it named
'less_by' instead of 'order_by'), but I don't see how can I use it with something 
different than a data member as the key
itself. On some applications, specially outside databases, the keys are given by 
runtime expressions and not just stored data
members.
If the users would have to end up writting their own predicates, the separaton will 
make that task a lot easier.
Effectively, less_by to some extent *is* separating key extraction from 
comparison.  It would be implemented to a greater extent if the 
additional functor were provided that was a more generic version of 
less_by, where the first argument is the key extractor functor. 
However, there are certain advantages, as I have discussed in previous 
posts, to integrating this separation into the table class directly.

[...]

If you want restoring-on-collision, use update(). To sum it up:

* update() does restore-on-collision, with the overhead of an element
copy.
* modify() does not incur an element copy, but on collision the
modified element is erased.
I don't think we have other options here. I've tried to model
modify() following your request and those of others, and the approach
is IMHO forced
by design. Anyway, I'm open to new ideas.
OK, this is the best we could think of, at least right now.
Many times the user knows exactly what she's doing so that no collisions
will ever really ocurr. For example, if the modify does not change keys
but other data.
For this cases modify() is a big plus. The acutal user code could assert
the result of modify() so that a coallision is treated as postcondition
violation.
>
>
> The documentation could express emphatically that if the modification
> changed keys, update() must be used, else, modify() can.
If the keys are not modified, a modify method is not needed.  The idea 
of the modify method is to fix ordering as needed as a result of 
changing a key.  Additionally, versions of modify could be provided that 
allow the user to specify, either by index number or a index tag (see 
below), in which indices reordering might be required, thus saving the 
cost of determining that in other indices, which the user knows will not 
be modified, no reordering is required.  This functionality is important 
for efficiency.

Ideally, a convenient syntax could be devised for specifying that no 
reordering will be required as a result of a modify operation.  The 
problem is that it seems that, for instance:

modify(it, functor); // should mean reorder in all indices
modify<3>(it, functor); // reorder only 3
modify(it, functor); // reorder only indices with TagType tag
Possibly, the way to do that would be to specify a tag-type that is not 
used by any index.  E.g. if void is unused, then:

modify(it, functor); // no reordering required

Instead of this syntax, the alternative is to make the iterators 
non-constant.  This is inconsistent with standard library associative 
containers (although this is not really an associative container); it 
does, however, allow for the use of standard algorithms that require 
non-constant iterators (but which will not modify the elements such that 
reordering would be necessary) without an adaptor.  If non-constant 
iterators are not provided, it seems it would be useful to provide a 
const_casting iterator adaptor.

A possibility is to protect modify() so that it can used but not accidentally.
That is, we can prevent a

Re: [boost] Re: Interest in multiindex_set?(again)

2003-07-12 Thread JOAQUIN LOPEZ MU?Z
Hi Fernando,

- Mensaje Original -
De: Fernando Cacciola <[EMAIL PROTECTED]>
Fecha: Sábado, Julio 12, 2003 1:22 am
Asunto: [boost] Re: Interest in multiindex_set?(again)

> Hi Joaquín,
> 
> Unfortunately, I douldn't compile the code with BCC because it 
> extensivelyuses non-type template parameters which are only 
> partially supported on
> Borland.
> 
> After having read the documentation and looked at the code again, 
> I realize
> now that your class is really a database-like table but not an 
> associativecontainer. In fact, it is not even a container.
> The problem is that you cannot traverse over _all_ the elements 
> linerly.(See 23.1, Table 65)

Yes you can, see below.

> I previously took for granted that it was a container so I asked 
> about the
> _global_ arrangement of the elements.
> I think that you shoud add iterators to traverse over the entire 
> table as if
> it were a Sequence. This is the basic requirement of a container.
> 

With dismay I realize how very poor my documentation skills are :)
multiindex_set *is* a container in the sense you describe. Given
a multiindex_set instantiation say my_mx_set witn N indices,
each index provides a set-like interface with allows the user to
traverse *all the elements* contained:

my_mx_set mxs;
...
my_mx_set::index_type::type& index_n=
  mxs.get(); // n between 0 and the N-1

Now, index_n.begin() and index_n.end() let you enumerate *all* the
elements in mxs, regardless of n; the difference is in the order
they are enumerated, which depends on the comparison predicate
associated with index #n. By convention, my_mx_set inherits the
functionailty of index #0, that is

my_mx_set::some_memfun(...);

is the same as

my_mx_set::index_type<0>::type::some_memfun(...);

In particular, my_mx_set::begin() and my_mx_set::end() let
you traverse through then entire set of elements contained,
with the order induced by index #0. Does this ask your question?
If not, please let me know and I'll be happy to explain it further.
I'm most determined to eliminate this semantic barrier between
both of us :)

Also, may I suggest you download some distribution of g++ and
play with the library a little. Maybe then things become clearer.

> Now, suppose that you can iterate over the all of the elements: 
> what's the
> order in which elements will appear w.r.t the insertion sequence 
> and the
> ordering implied by the indices? This is what I've asked you about the
> "ordering and clustering invariants" of the data structure.
> 
> If, during a linear traversal (that is, iterating over all of the 
> elementsas if it were a sequence), the elements will appear in an 
> unspecified order,
> then the data structure is not an associative container (much less 
> a set),
> so it should _definitely_ be called 'table' and not 'set'.
> 
> I think I understand now why the term 'index'. It reseambles the
> filtering-key associated with the 'field' which designates a 
> 'column' on a
> database table. However, if I'm not mistaken, the filtered-column 
> itself,that is, the sequence of elements with such value in the 
> correspondingfield, is not an 'index'. So the subsquence which is 
> mapped to a given index
> shouln't be called index, I think. I would call it a 'cluster' as I
> suggested before.
> 
> Since I can't play with the code I couldn't see how much of a set 
> is each
> 'cluster'. Make sure to check with the requirements and document this
> clearly.
> 

Hopefully, this has been asked above. Again, please let me know otherwise
so that I try to make myself clearer.

> Another thing I couldn't figure out is how to compose indices 
> hierachically.That is, how to reproduce a typical SQL SELECT * 
> ORDER BY X,Y,Z.
> Since you're modeling an indexed table, this functionality should be
> supported.

This is an interesting subject to study, but as some others are already
writing a relational framework I guess we should limit here to the
basics of the container. SQL-like predicates can be built on top of it
in the context of more ambitious libraries.

> 
> I vote for a separate KeyExtraction because:
> 
> (a) Key extraction and key comparison are conceptually orthogonal.
> 
> (b) The orthogonality will be important in practice when users 
> want to
> compare whith other predicates besides less_than and/or based on 
> more than
> just data members.
> With your mixin approach, it would be more difficult to have an 
> index based
> on a runtime expression
> (such as a hash value) combined with a comparison other than <.

The most usual situation for specification of an index (standard
less_than comparison on some

[boost] Re: Interest in multiindex_set?(again)

2003-07-11 Thread Fernando Cacciola
Hi Joaquín,

Unfortunately, I douldn't compile the code with BCC because it extensively
uses non-type template parameters which are only partially supported on
Borland.

After having read the documentation and looked at the code again, I realize
now that your class is really a database-like table but not an associative
container. In fact, it is not even a container.
The problem is that you cannot traverse over _all_ the elements linerly.
(See 23.1, Table 65)
I previously took for granted that it was a container so I asked about the
_global_ arrangement of the elements.
I think that you shoud add iterators to traverse over the entire table as if
it were a Sequence. This is the basic requirement of a container.

Now, suppose that you can iterate over the all of the elements: what's the
order in which elements will appear w.r.t the insertion sequence and the
ordering implied by the indices? This is what I've asked you about the
"ordering and clustering invariants" of the data structure.

If, during a linear traversal (that is, iterating over all of the elements
as if it were a sequence), the elements will appear in an unspecified order,
then the data structure is not an associative container (much less a set),
so it should _definitely_ be called 'table' and not 'set'.

I think I understand now why the term 'index'. It reseambles the
filtering-key associated with the 'field' which designates a 'column' on a
database table. However, if I'm not mistaken, the filtered-column itself,
that is, the sequence of elements with such value in the corresponding
field, is not an 'index'. So the subsquence which is mapped to a given index
shouln't be called index, I think. I would call it a 'cluster' as I
suggested before.

Since I can't play with the code I couldn't see how much of a set is each
'cluster'. Make sure to check with the requirements and document this
clearly.

Another thing I couldn't figure out is how to compose indices hierachically.
That is, how to reproduce a typical SQL SELECT * ORDER BY X,Y,Z.
Since you're modeling an indexed table, this functionality should be
supported.

I vote for a separate KeyExtraction because:

(a) Key extraction and key comparison are conceptually orthogonal.

(b) The orthogonality will be important in practice when users want to
compare whith other predicates besides less_than and/or based on more than
just data members.
With your mixin approach, it would be more difficult to have an index based
on a runtime expression
(such as a hash value) combined with a comparison other than <.

Finally, I'm not entirely happy with the coallision response of 'modify' (or
maybe I don't understand it).
Is it ever possible to afford removing the colliding modified element?
Imagine I change the Social Security Number of Jhon, and by mistake, I
assign him
the same number as Carla... the modify() will return 'false', but there
won't be a
Jhon anymore in the table

Best,

Fernando Cacciola





___
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost


Re: [boost] Re: Interest in multiindex_set?(again)

2003-07-11 Thread Joaquín Mª López Muñoz


Jeremy Maitin-Shepard ha escrito:

[...]

> > find(key) and related search memfuns are already supported in the way
> > you suggest! Look for "special lookup operations" in the documentation.
> > Matter of fact, the operations provided are even more powerful in that
> > they allow for alternative arguments and comparison predicates.
>
> It does not appear, however, that they support automatically detecting
> the index based on the type of the key.  This functionality seems
> useful, although in general it is only a syntactic convenience..
> Specifically, if a table has two indices, one with a key type of int,
> the other with key type of std::string, with this functionality, the
> following syntax could be used:  (note that the operator [] I show here
> could not have the semantics of std::map; rather it would need to throw
> an exception or have undefined behavior if an existing element is not found)

No, I guess my design is very far away from having the ability to provide
automatic index selection based on the arg type. Nevertheless, this ability
wouldn't be so useful when you have severalmembers with the same type, which
is expectedly not uncommon.
Anyway, I underestand the lack of this feature is not a showstopper for you.

>
>
> table["somekey"].whatever;
>
> The above statement would use the std::string index, because the type of
> the argument (char const *) is convertible to std::string.
>
> table[3].whatever

I did something similar in the past in a bidirectional map of mine, but here the
stituation is far more complicated. Unless some brilliant mind comes here
with an implementation approach, this feature won't go into the library.

>
>
> The above statement would use the int index, because the type of the
> argument is int.
>
> A similar syntax could be used for find, equal_range, count, erase.
> Additionally, something like the following could be supported, where
> table_type is the type of the table:
>
> table_type::index_with_key::iterator it =
> table.find("something");

Same comment as above, clashes in index deduction are likely to
occur for moderately coplex tables having indices over elements with
the same type. Tagged indices, OTOH, are nice to have. Please
read at the bottom of this post.

>
>
> Note that you provide a version of equal_range that is not provided by
> std::map or set, namely, a two-argument version, with the semantics of
> returning the range between the two arguments.  It is not possible to
> implement this for a hash table index.  Under a policy-based interface,
> this could be supported by the "view" of an index that is a binary
> search tree, while not being supported by the table class itself.

This is an error in the docs :) I fixed the example a few days ago, please
download the .zip again. Actually, equal_range has a standard signature,
just like other assoc containers.


>
>
> Similarly, finding using a weaker ordering predicate is only supported
> by binary-search-tree-based indices, and thus it seems this should be
> supported on a per-index basis only.
>
> Even if it is decided that automatically detecting the index based on
> the key type is not useful, I think it is advantageous to decouple key
> extraction from key comparison.
>
> >>Couldn't a `composite index' be supported by a key which is a tuple of
> >>some number of references?
> >
> >
> > I guess you're taking inspiration here from your own library's design. This
> > implementation of composite indices simply doesn't fit here. You can consider
> > things are "the other way around" with respect to indices and dependent keys:
> > every index is composite. Those that happen to depend on a single member
> > of the element can be conveniently built by resorting to less_by. I'm
> > not sure I'm being clear enough. If not, please tell me so.
> >
> > On a side note, taking composite indices as indices over a tuple of
> > members does not cover the full spectrum of what an index can do. Imagine an
> > index whose associated comparison predicate depend on the result of an
> > external function, say
> >
> > int f(const element& e);
> >
> > f() won't accept a tuple of members of element, but the actual element. To make
> > it fit one would be forced to construct a new element on the fly out of the
> > provided parts.
>
> Although you have not specified the semantics of f(), it seems that f()
> is effectively a key extractor, and the key type would be int.  In order
> to use this with your current system of integrating key extractor with
> key comparison, an additional functor would be needed that effectively
> composed this function f() with some comparison predicate.  In order to
> allow the user to use an int value (rather than an element value) as a
> key, this composition functor would also have to provide a number of
> overloads some of which bypass this key extraction functor for one of
> the arguments.  For a hash-table-based index, it becomes even more
> complicated, because a second "composition" fu

[boost] Re: Interest in multiindex_set?(again)

2003-07-10 Thread Jeremy Maitin-Shepard
Joaquín Mª López Muñoz wrote:
The semantics of std::map (or any other STL associative container) are
not overwriting on insertion: instead, if an equivalent element is found, no
insertion occurs. This is what my library does, too. Maybe you're confusing
here map::operator[] with map::insert.
Yes, I was confused.  Your semantics are correct.

There are various advantages to separating key extraction from key
comparison:
1) It allows the use of more generic key extraction and key comparison
functors.  Specifically, for an ordered index, the key comparison
function in many cases would be std::less, and can default to
that.  For "unordered" (hash-table-like) indices, it is necessary, at
least internally, to separate key comparison and key extraction because
the hash function must also operate on the "key" value; otherwise, the
hash function must also be very specialized, when in many cases a
default library-provided hash function can be used (and thus not specified).


A small utility class named less_by is used for precisely this purpose. You can
see it in action in the example accompanying the library. As for hashing, an
analogous device can be written (when hashes get included).

2) It is necessary in order to support certain convenient interfaces;
specifically, I think it is very convenient to allow the syntax:
table[key], find(key), and equal_range(key).  In order for the front-end
to deteremine which index corresponds to the key type, it must "know"
about keys.


find(key) and related search memfuns are already supported in the way
you suggest! Look for "special lookup operations" in the documentation.
Matter of fact, the operations provided are even more powerful in that
they allow for alternative arguments and comparison predicates.
It does not appear, however, that they support automatically detecting 
the index based on the type of the key.  This functionality seems 
useful, although in general it is only a syntactic convenience.. 
Specifically, if a table has two indices, one with a key type of int, 
the other with key type of std::string, with this functionality, the 
following syntax could be used:  (note that the operator [] I show here 
could not have the semantics of std::map; rather it would need to throw 
an exception or have undefined behavior if an existing element is not found)

table["somekey"].whatever;

The above statement would use the std::string index, because the type of 
the argument (char const *) is convertible to std::string.

table[3].whatever

The above statement would use the int index, because the type of the 
argument is int.

A similar syntax could be used for find, equal_range, count, erase.
Additionally, something like the following could be supported, where 
table_type is the type of the table:

table_type::index_with_key::iterator it = 
table.find("something");

Note that you provide a version of equal_range that is not provided by 
std::map or set, namely, a two-argument version, with the semantics of 
returning the range between the two arguments.  It is not possible to 
implement this for a hash table index.  Under a policy-based interface, 
this could be supported by the "view" of an index that is a binary 
search tree, while not being supported by the table class itself.

Similarly, finding using a weaker ordering predicate is only supported 
by binary-search-tree-based indices, and thus it seems this should be 
supported on a per-index basis only.

Even if it is decided that automatically detecting the index based on 
the key type is not useful, I think it is advantageous to decouple key 
extraction from key comparison.

Couldn't a `composite index' be supported by a key which is a tuple of
some number of references?


I guess you're taking inspiration here from your own library's design. This
implementation of composite indices simply doesn't fit here. You can consider
things are "the other way around" with respect to indices and dependent keys:
every index is composite. Those that happen to depend on a single member
of the element can be conveniently built by resorting to less_by. I'm
not sure I'm being clear enough. If not, please tell me so.
On a side note, taking composite indices as indices over a tuple of
members does not cover the full spectrum of what an index can do. Imagine an
index whose associated comparison predicate depend on the result of an
external function, say
int f(const element& e);

f() won't accept a tuple of members of element, but the actual element. To make
it fit one would be forced to construct a new element on the fly out of the
provided parts.
Although you have not specified the semantics of f(), it seems that f() 
is effectively a key extractor, and the key type would be int.  In order 
to use this with your current system of integrating key extractor with 
key comparison, an additional functor would be needed that effectively 
composed this function f() with some comparison predicate.  In order to 
allow the user to use an int val

Re: [boost] Re: Interest in multiindex_set?(again)

2003-07-10 Thread Joaquín Mª López Muñoz


Jeremy Maitin-Shepard ha escrito:

> Joaquín Mª López Muñoz wrote:
>
>  >
>  > This is a no-no policy. Collision can happen with more than one element.
>  > Following this approach could result in an single update sweeping off
>  > half the elements of the container :) I don't think users of the
> library want
>  > this.
>
> What are the current semantics of your insert?  In order to be
> consistent with std::map, the behavior has to be to overwrite (delete)
> an existing element with the same key.  The number of elements deleted
> by an insert or modify or update operation under the semantics I suggest
> is at most the number of unique indices.  These are the semantics I
> would expect anyway, although I suppose it is a matter of opinion.

The semantics of std::map (or any other STL associative container) are
not overwriting on insertion: instead, if an equivalent element is found, no
insertion occurs. This is what my library does, too. Maybe you're confusing
here map::operator[] with map::insert.


>
>
>  > I think the analysis is not precise enough. In the modify approach you're
>  > proposing, modifying (by way of the user-provided functor) takes
> place *before
>  > anything*. If this functor throws, we got nothing else to do but let
> the exception
>  >
>  > propagate. It is the responsibility of the user to guarantee that the
> modyfing
>  > functor does not leave things in a bad state.
>  > An acceptable (IMHO) approach for modify would be as follows:
>  >
>  > a. Call the user-provided modifying functor. If it throws, let the
> exception
>  > propagate.
>
> Okay, now that I reconsider, this seems most consistent with standard
> containers, since it follows the idiom of "exception thrown means
> nothing happened."
>
>  > b. Attempt reordering.
>  > b1. If there's a collision, delete the modified element (this can
> be done in
>  > a no-throw way) and return false
>
> I still think it is better to delete the older elements rather than the
> new one, consistent with my proposed semantics for insert (and
> consistent with standard container insert).

same comment about insertion semantics as above.

>
>
>  > b2. If there's an exception (by some comparison predicate),
> delete the
>  > modified
>  > element and rethrow.
>
> Okay, this seems reasonable, and would be consistent with insert.

So, to sum it up, if we both agree that insertion will not overwrite (as
it is the semantics of STL assoc. containers), then is this approach
for modify OK with you?

I've included the new memfun modify following your indications and those
of Fernando, and it works right. It'll go in next release, which I hope it'll be
submitted for preformal review.

[...]

>
> There are various advantages to separating key extraction from key
> comparison:
>
> 1) It allows the use of more generic key extraction and key comparison
> functors.  Specifically, for an ordered index, the key comparison
> function in many cases would be std::less, and can default to
> that.  For "unordered" (hash-table-like) indices, it is necessary, at
> least internally, to separate key comparison and key extraction because
> the hash function must also operate on the "key" value; otherwise, the
> hash function must also be very specialized, when in many cases a
> default library-provided hash function can be used (and thus not specified).

A small utility class named less_by is used for precisely this purpose. You can
see it in action in the example accompanying the library. As for hashing, an
analogous device can be written (when hashes get included).

>
>
> 2) It is necessary in order to support certain convenient interfaces;
> specifically, I think it is very convenient to allow the syntax:
> table[key], find(key), and equal_range(key).  In order for the front-end
> to deteremine which index corresponds to the key type, it must "know"
> about keys.

find(key) and related search memfuns are already supported in the way
you suggest! Look for "special lookup operations" in the documentation.
Matter of fact, the operations provided are even more powerful in that
they allow for alternative arguments and comparison predicates.

[...]

>
>
> Couldn't a `composite index' be supported by a key which is a tuple of
> some number of references?

I guess you're taking inspiration here from your own library's design. This
implementation of composite indices simply doesn't fit here. You can consider
things are "the other way around" with respect to indices and dependent keys:
every index is composite. Those that happen to depend on a single member
of the element can be conveniently built by resorting to less_by. I'm
not sure I'm being clear enough. If not, please tell me so.

On a side note, taking composite indices as indices over a tuple of
members does not cover the full spectrum of what an index can do. Imagine an
index whose associated comparison predicate depend on the result of an
external function, say

int f(const element& e);

f

[boost] Re: Interest in multiindex_set?(again)

2003-07-09 Thread Daryle Walker
On Tuesday, July 8, 2003, at 12:25 AM, Joaquín Muñoz wrote:

[SNIP]
A related question: Should boost::multiindex::multiindex_set be raised 
into Boost namespace as boost::multiindex_set (or whatever its final 
name)? Seems the standard practice, but I think it is safer to ask 
first.
[TRUNCATE]

If the library will be insanely huge, like Regex, Graph, or Spirit, 
then it could go in its own namespace.

If libraries of a similar domain are likely to be made (or have already 
been made), then all of those libraries could share a namespace (like 
Math, Numeric, I/O).

If both of these conditions exist, then you could create your namespace 
within the domain namespace.

Else, the library is fairly small and unique, so it could go in the top 
Boost namespace.

(Didn't someone propose a "container" namespace?  Maybe this class 
template can go there.)

Daryle

___
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost


[boost] Re: Interest in multiindex_set?(again)

2003-07-09 Thread Jeremy Maitin-Shepard
Joaquín Mª López Muñoz wrote:

>
> This is a no-no policy. Collision can happen with more than one element.
> Following this approach could result in an single update sweeping off
> half the elements of the container :) I don't think users of the 
library want
> this.

What are the current semantics of your insert?  In order to be 
consistent with std::map, the behavior has to be to overwrite (delete) 
an existing element with the same key.  The number of elements deleted 
by an insert or modify or update operation under the semantics I suggest 
is at most the number of unique indices.  These are the semantics I 
would expect anyway, although I suppose it is a matter of opinion.

> I think the analysis is not precise enough. In the modify approach you're
> proposing, modifying (by way of the user-provided functor) takes 
place *before
> anything*. If this functor throws, we got nothing else to do but let 
the exception
>
> propagate. It is the responsibility of the user to guarantee that the 
modyfing
> functor does not leave things in a bad state.
> An acceptable (IMHO) approach for modify would be as follows:
>
> a. Call the user-provided modifying functor. If it throws, let the 
exception
> propagate.

Okay, now that I reconsider, this seems most consistent with standard 
containers, since it follows the idiom of "exception thrown means 
nothing happened."

> b. Attempt reordering.
> b1. If there's a collision, delete the modified element (this can 
be done in
> a no-throw way) and return false

I still think it is better to delete the older elements rather than the 
new one, consistent with my proposed semantics for insert (and 
consistent with standard container insert).

> b2. If there's an exception (by some comparison predicate), 
delete the
> modified
> element and rethrow.

Okay, this seems reasonable, and would be consistent with insert.

> I'm strongly against key extraction: it complicates the instantiation 
of the
> container, has no apparent (to me) usefulness and bans the use of 
composite
> indices. Single-member indices are well served by the less_by utility.
> Maybe someone can say something in favor of key extraction: otherwise,
> it won't get in.

There are various advantages to separating key extraction from key 
comparison:

1) It allows the use of more generic key extraction and key comparison 
functors.  Specifically, for an ordered index, the key comparison 
function in many cases would be std::less, and can default to 
that.  For "unordered" (hash-table-like) indices, it is necessary, at 
least internally, to separate key comparison and key extraction because 
the hash function must also operate on the "key" value; otherwise, the 
hash function must also be very specialized, when in many cases a 
default library-provided hash function can be used (and thus not specified).

2) It is necessary in order to support certain convenient interfaces; 
specifically, I think it is very convenient to allow the syntax:
table[key], find(key), and equal_range(key).  In order for the front-end 
to deteremine which index corresponds to the key type, it must "know" 
about keys.

Note that having key extractors does not prohibit using a comarison 
functor that is integrated with key extraction, since the formal key 
extractor can be specified as an identity functor, thus the key_type is 
the value_type.

Couldn't a `composite index' be supported by a key which is a tuple of 
some number of references?

> As for the non-const iterator, I don't really know what the status of 
this
> proposal for std::set is (can someone provide some pointer to info?)
> My feeling is that if modify is finally provided, its computational 
cost in
> case 3 is acceptable (it adds 2N comparisons where N is the number of
> indices). So I wouldn't pursue the non-const iterator approach  (and as
> always you can resort to const_casting).

http://std.dkuug.dk/jtc1/sc22/wg21/docs/lwg-defects.html#103

I rechecked the status of the proposal, and it appears that the LWG has 
decided to clarify that both set iterator and const_iterator are 
constant.  The text describing the defect (not the resolution) is 
somewhat inconsistent with the actual resolution, and it appears that 
the defect report initially proposed as the resolution requiring a 
non-constant set iterator, but the LWG later changed the resolution to 
the one upon which they decided.

Thus it appears your decision to have only constant iterators is correct.

It occurs to me that a mechanism for the user to specify that reordering 
may be required in only certain indices as a result of a modify 
operation.  This syntax could be used:

table.modify(it, functor); // all indices are updated
table.modify<0, 1, 3>(it, functor); // indices 0, 1, and 3 are updated
table.modify(it, functor); // indices with tag1 are updated
table.modify(it, functor); // indices with either tag1 or
   // tag2 are updated
(The tagging m

Re: [boost] Re: Interest in multiindex_set?(again)

2003-07-09 Thread Joaquín Mª López Muñoz


Jeremy Maitin-Shepard ha escrito:

[...]

>
> How about indexed_table?  This container is *not* a set, since there can
> be duplicates, but it *does* resemble a relational database table.
>
> Then it can be defined in namespace boost::table, and additionally
> promoted to namespace boost.

Seems good to me, it'd be great to know others' opinion.

[...]

>
> Actually, using partial specialization, you can avoid re-specifying the
> default/unique.  multiple would possibly be a better name than
> non_unique.  Additionally, I would recommend allowing a specification of
> the default "uniqueness" of the indices.

Seems good too :) I think I got an idea of how to implement this, I'll put
myself to work on it.

>
>
>  > (I plan to add hash indices sometime by the end of 2003).
>
> A policy-based approach seems the best way to support different types of
> indices.  Currently, I am working on a policy-based equivalent to this
> class which will support an "ordered" (binary search tree-based) index
> and an "unordered" (hash-table-based) index, as well as any user-defined
> index policies (which would allow a different hash table or tree
> implementation).  This should be completed within a week or two.
>

Do you intend to submit that to Boost? If so, then there's some collision
between the two libraries, maybe we can cooperate somehow. If not, maybe
you can lend some ideas to (re-baptised) indexed_table. Please drop me a
mail if you feel like discussing this.

>
>  > I don't think it is a question of how smartly update() is coded, but an
>  > actual conceptual impossibility. If we are to provide strong
> exception-safety,
>  > then no fail or throw is permitted after the element has changed its
> value.
>  > So, there are three scenarios:
>  >
>  > 1. Updating can possibly fail (because of collision with some other
> element) or
>  >
>  > reorder the updated element: no other soluction but to incur a copy
> of the
>  > element, as the actual updating has to take place *after* the
> reordering is
>  > done.
>
> It seems the best behavior in the case of a collision is to delete the
> old element with which there is a collision.  This would be consistent
> with the semantics of insert.

This is a no-no policy. Collision can happen with more than one element.
Following this approach could result in an single update sweeping off
half the elements of the container :) I don't think users of the library want
this.
I think current update() has some value as it is now. I keep discussing
the alternate memfun modify you propose below.

>
>
> Instead of an update method, a modify method that takes an iterator and
> a one-argument (a reference to value_type) functor which it executes on
> the value to which the iterator points.  Combined with boost.lambda and
> boost.bind, this system can be very convenient.

This is similar to what Fernando Cacciola proposed some days ago.

>
>
> In the case that the modification functor (or copy constructor, in the
> case of the existing update method) throws an exception, the semantics
> could be one of the following:
>
> 1. The container is left in an undefined state, and the exception is
> propogated to the caller.
>
> 2. The exception is caught, reordering is attempted; if an exception is
> thrown while reordering, by one of the comparison, etc. functors, the
> container is left in an undefined state and the exception is propogated
> to the caller; if reordering succeeds, the original exception thrown the
> by modification functor is rethrown to the caller.  Thus, in either case
> an exception is propogated to the caller, and by the type of the
> exception whether the container is in an undefined state.
>
> 3. Same as (2), except that in the case that the reordering filas,
> before the exception thrown by a comparison function is propogated to
> the caller, the would-be-modified element is erased from the container,
> leaving it in a well-defined state.  The exception thrown by the
> comparison function is still propogated to the user.
>
> (3) seems like the most desirable solution, thus that is the behavior I
> would recommend.

I think the analysis is not precise enough. In the modify approach you're
proposing, modifying (by way of the user-provided functor) takes place *before
anything*. If this functor throws, we got nothing else to do but let the exception

propagate. It is the responsibility of the user to guarantee that the modyfing
functor does not leave things in a bad state.
An acceptable (IMHO) approach for modify would be as follows:

a. Call the user-provided modifying functor. If it throws, let the exception
propagate.
b. Attempt reordering.
b1. If there's a collision, delete the modified element (this can be done in
a no-throw way) and return false
b2. If there's an exception (by some comparison predicate), delete the
modified
element and rethrow.

What dou you think about it?

[...]

>
>  > 3. Updating does not alter the order of the updated elemen

[boost] Re: Interest in multiindex_set?(again)

2003-07-08 Thread Jeremy Maitin-Shepard
Joaquín Mª López Muñoz wrote:

> If we forget about the namespace name, then I have no objections against
> indexed_set (though std::sets are indexed by nature, but this is 
probably not
> a common perception between users).
> I thought long and hard about name candidates and come up with none 
except
> multiindex_set.
> If no one says anything against it, I'll change it to indexed_set. 
The problem
> remains
> of how to name its associated namespace.

How about indexed_table?  This container is *not* a set, since there can 
be duplicates, but it *does* resemble a relational database table.

Then it can be defined in namespace boost::table, and additionally 
promoted to namespace boost.

>> I don't see any acceptable (non-invasive) way to implement marking 
only one
>> class. But the unique/non_unique two template approach seems perfectly
>> acceptable. You could even argue that it forces the user to consciously
>> make  the critical unique/non_unique decision for each index, so may 
reduce
>> user error.
>
>
> I agree with you. The all-predicates-are-marked approach is even 
better in the
> light
> of future change

Actually, using partial specialization, you can avoid re-specifying the 
default/unique.  multiple would possibly be a better name than 
non_unique.  Additionally, I would recommend allowing a specification of 
the default "uniqueness" of the indices.

> (I plan to add hash indices sometime by the end of 2003).

A policy-based approach seems the best way to support different types of 
indices.  Currently, I am working on a policy-based equivalent to this 
class which will support an "ordered" (binary search tree-based) index 
and an "unordered" (hash-table-based) index, as well as any user-defined 
index policies (which would allow a different hash table or tree 
implementation).  This should be completed within a week or two.

> I don't think it is a question of how smartly update() is coded, but an
> actual conceptual impossibility. If we are to provide strong 
exception-safety,
> then no fail or throw is permitted after the element has changed its 
value.
> So, there are three scenarios:
>
> 1. Updating can possibly fail (because of collision with some other 
element) or
>
> reorder the updated element: no other soluction but to incur a copy 
of the
> element, as the actual updating has to take place *after* the 
reordering is
> done.

It seems the best behavior in the case of a collision is to delete the 
old element with which there is a collision.  This would be consistent 
with the semantics of insert.

Instead of an update method, a modify method that takes an iterator and 
a one-argument (a reference to value_type) functor which it executes on 
the value to which the iterator points.  Combined with boost.lambda and 
boost.bind, this system can be very convenient.

In the case that the modification functor (or copy constructor, in the 
case of the existing update method) throws an exception, the semantics 
could be one of the following:

1. The container is left in an undefined state, and the exception is 
propogated to the caller.

2. The exception is caught, reordering is attempted; if an exception is 
thrown while reordering, by one of the comparison, etc. functors, the 
container is left in an undefined state and the exception is propogated 
to the caller; if reordering succeeds, the original exception thrown the 
by modification functor is rethrown to the caller.  Thus, in either case 
an exception is propogated to the caller, and by the type of the 
exception whether the container is in an undefined state.

3. Same as (2), except that in the case that the reordering filas, 
before the exception thrown by a comparison function is propogated to 
the caller, the would-be-modified element is erased from the container, 
leaving it in a well-defined state.  The exception thrown by the 
comparison function is still propogated to the user.

(3) seems like the most desirable solution, thus that is the behavior I 
would recommend.

> 2. Updating won't fail (because the progammer knows it) but can result in
> reordering. Here a no-throw update is possible that doesn't require an
> additional copy.
In this case, the modify method described above is sufficient, and the 
exception behavior is irrelevant.

> 3. Updating does not alter the order of the updated element (because the
> programmer knows it). One can just change the value in-place and forget
> about it.
A non-const iterator should be provided. (I believe this is what has 
been proposed for std::set)  Requiring that const_cast is used to obtain 
a non-const value from an iterator seems like too strong a warning.  In 
the case where the key for an index (assuming you separate key 
extraction from key comparison as has been recommended) is a particular 
member of the value, it is often rather obvious whether a particular 
operation will or will not modify the key.

---
Jeremy Maitin-Shepard



Re: [boost] Re: Interest in multiindex_set?(again)

2003-07-08 Thread Joaquín Mª López Muñoz
Beman Dawes ha escrito:

[...]

> I'm more interested in the class name than the namespace name. One problem
> at a time. If you weren't worrying about the namespace name, would you then
> like indexed_set as the class name? What are some other alternatives?

If we forget about the namespace name, then I have no objections against
indexed_set (though std::sets are indexed by nature, but this is probably not
a common perception between users).
I thought long and hard about name candidates and come up with none except
multiindex_set.
If no one says anything against it, I'll change it to indexed_set. The problem
remains
of how to name its associated namespace.

[...]

>
> Yes, I think so. It is a bit of a judgement call, but since the library is
> a very general purpose tool, and assuming the library only adds a few names
> to the namespace it lives in, I'd rather see it in namespace boost. Others
> may disagree; a lot of people like tall and deep namespace hierarchies even
> when the components are very general purpose.

OK, next installment will have the container promoted to boost::indexed_set.
There are some auxiliary classes which I don't know exactly where to put, I'll
think it over.

> I don't see any acceptable (non-invasive) way to implement marking only one
> class. But the unique/non_unique two template approach seems perfectly
> acceptable. You could even argue that it forces the user to consciously
> make  the critical unique/non_unique decision for each index, so may reduce
> user error.

I agree with you. The all-predicates-are-marked approach is even better in the
light
of future change (I plan to add hash indices sometime by the end of 2003).

[...]

>
> I understand the rationale for that, but am still concerned. There are many
> applications where the cost of a copy to apply a minor update is
> prohibitive. The user is then forced to add a layer of indirection, or use
> a home-made container. Ugh. This is one of the few cases I run into where a
> safe design is so inefficient compared to a traditional unsafe design that
> in real production code I might be forced to use the unsafe design. That
> makes me uncomfortable.

I don't think it is a question of how smartly update() is coded, but an
actual conceptual impossibility. If we are to provide strong exception-safety,
then no fail or throw is permitted after the element has changed its value.
So, there are three scenarios:

1. Updating can possibly fail (because of collision with some other element) or

reorder the updated element: no other soluction but to incur a copy of the
element, as the actual updating has to take place *after* the reordering is
done.
2. Updating won't fail (because the progammer knows it) but can result in
reordering. Here a no-throw update is possible that doesn't require an
additional copy.
3. Updating does not alter the order of the updated element (because the
programmer knows it). One can just change the value in-place and forget
about it.

1 is taken care of by current multiindex_set::update. 3 can be solved with
appropriate const_casting: this is dangerous as any cast is, but then again
the programmer *knows* what she's doing. Furthermore, this is exactly
the same situation as in std::set. We are left with 2: please note that
it is extremely error-prone to trust the programmer in her implicit knowing
that no collision will happen (checking 3, on the other hand, is much
easier, as probably what the programmer is changing are auxiliary members
that no comparison predicate depend on). Now, if the no-collision
guarantee is assumed, the updating can be performed without throwing.

To sum it up, as it stands now mutiindex_set allows for updating in
scenarios 1 and 3. Scenario 2 (programmer knows the updated element
won't clash with some other element) can be realized (Fernando suggested
a nice way to do it), but my opinion is that this facility would be extremely
dangerous to handle. I'd rather leave things as they are now, but I'm
open to suggestions, maybe I'm missing something obvious that will solve
the whole problem in a fell swoop.

Open issues:

* Name will change to indexed_set if no one complains. What name to
use for the associated namespace?
* Is it advisable to provide an no-copy update facility relying on the
assumption that updating won't clash with other elements in the set?

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo

___
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost


Re: [boost] Re: Interest in multiindex_set?(again)

2003-07-08 Thread Beman Dawes
At 06:25 PM 7/7/2003, =?windows-1252?Q?JOAQUIN_LOPEZ_MU=3FZ?= wrote:

>Hi Beman,
>
>- Mensaje Original -
>[...]
>> *  The "multiindex_set" name seems awkward to me. Maybe
>> "indexed_set" or
>> "set_index"?
>
>I don't like the name either, and would be happy if someone comes
>with something better. Nevertheless, I don't think indexed_set is a
>good choice: when picking up a name another one must be chosen
>for the associated namespace (and it is convenient that these two are
>not the same, for reasons explained vg in Boost.Tuple docs). I
>chose multiindex_set/multiindex. The analogous indexed_set/index
>does not fit: index is too broad a denomination for the namespace.
I'm more interested in the class name than the namespace name. One problem 
at a time. If you weren't worrying about the namespace name, would you then 
like indexed_set as the class name? What are some other alternatives?

>A related question: Should boost::multiindex::multiindex_set be
>raised into Boost namespace as boost::multiindex_set (or whatever
>its final name)? Seems the standard practice, but I think it is safer
>to ask first.
Yes, I think so. It is a bit of a judgement call, but since the library is 
a very general purpose tool, and assuming the library only adds a few names 
to the namespace it lives in, I'd rather see it in namespace boost. Others 
may disagree; a lot of people like tall and deep namespace hierarchies even 
when the components are very general purpose.

>>   typedef boost::multiindex::multiindex_set<
>> employee,
>> boost::tuple< unique< employee::comp_id >,
>>   non_unique< employee::comp_name >,
>>   unique< employee_name::comp_ssnumber > >,
>> > employee_set;
>>
>
>This is definitely a good idea, Fernando also hinted at symplifing
>the specification of unique/non-unique indices. Any suggestion on how
>to implement it? With two templates for unique and non-unique indices
>the thing is simple; it is not as simple when only one class is marked.
>Suggestions here are most welcome.
I don't see any acceptable (non-invasive) way to implement marking only one 
class. But the unique/non_unique two template approach seems perfectly 
acceptable. You could even argue that it forces the user to consciously 
make  the critical unique/non_unique decision for each index, so may reduce 
user error.

>[...]
>> * It isn't clear (or I missed it) if iterators are always
>> logically
>> const-iterators or not. In other words, could your example have
>> been
>> written like this?
>>
>>   typedef index_type::type employee_set_by_name;
>>   employee_set_by_name& i=es.get<1>();
>>
>>   employee_set_by_name::iterator it=i.find("Anna Jones");
>>   it->name="Anna Smith"; //she just got married to Calvin Smith
>>
>
>No it couldn't. Iterators and const_iterators (which incidentally are the
>same) behave logically as const-iterators, just the same as in STL sets.
>update() has been carefully crafted so that strong exception-safety is
>provided: for that, it is necessary that a copy of the element is 
provided
>instead of allowing some sort of in-place modification of the element.
>This is discussed in some detail with Fernando somewhere up this thread.

I understand the rationale for that, but am still concerned. There are many 
applications where the cost of a copy to apply a minor update is 
prohibitive. The user is then forced to add a layer of indirection, or use 
a home-made container. Ugh. This is one of the few cases I run into where a 
safe design is so inefficient compared to a traditional unsafe design that 
in real production code I might be forced to use the unsafe design. That 
makes me uncomfortable.

>Thanks for your interest. If no objection is raised I plan to submit
>the library for pre-formal review in a couple of weeks.
Good!

--Beman

___
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost


Re: [boost] Re: Interest in multiindex_set?(again)

2003-07-07 Thread JOAQUIN LOPEZ MU?Z
Hi Beman,

- Mensaje Original -
[...]
> *  The "multiindex_set" name seems awkward to me. Maybe 
> "indexed_set" or 
> "set_index"?

I don't like the name either, and would be happy if someone comes
with something better. Nevertheless, I don't think indexed_set is a
good choice: when picking up a name another one must be chosen
for the associated namespace (and it is convenient that these two are
not the same, for reasons explained vg in Boost.Tuple docs). I
chose multiindex_set/multiindex. The analogous indexed_set/index
does not fit: index is too broad a denomination for the namespace.

A related question: Should boost::multiindex::multiindex_set be
raised into Boost namespace as boost::multiindex_set (or whatever
its final name)? Seems the standard practice, but I think it is safer
to ask first.

> 
> *  It seems safer to me for the default to be indexes that are unique.
> 
> *  Separating the specification of uniqueness from the list of 
> indices 
> seems like a recipe for mistakes, particularly during maintenance. 
> Is it 
> possible to combine uniqueness specification with the index tuple? 
> Perhaps 
> your example could look like this:
> 
>   typedef boost::multiindex::multiindex_set<
> employee,
> boost::tuple< employee::comp_id,
>   multi< employee::comp_name >,
>   employee_name::comp_ssnumber >,
> > employee_set;
> 
> or this:
> 
>   typedef boost::multiindex::multiindex_set<
> employee,
> boost::tuple< unique< employee::comp_id >,
>   non_unique< employee::comp_name >,
>   unique< employee_name::comp_ssnumber > >,
> > employee_set;
> 

This is definitely a good idea, Fernando also hinted at symplifing
the specification of unique/non-unique indices. Any suggestion on how
to implement it? With two templates for unique and non-unique indices
the thing is simple; it is not as simple when only one class is marked.
Suggestions here are most welcome.

[...]
> * It isn't clear (or I missed it) if iterators are always 
> logically 
> const-iterators or not. In other words, could your example have 
> been 
> written like this?
> 
>   typedef index_type::type employee_set_by_name;
>   employee_set_by_name& i=es.get<1>();
> 
>   employee_set_by_name::iterator it=i.find("Anna Jones");
>   it->name="Anna Smith"; //she just got married to Calvin Smith
> 

No it couldn't. Iterators and const_iterators (which incidentally are the
same) behave logically as const-iterators, just the same as in STL sets.
update() has been carefully crafted so that strong exception-safety is
provided: for that, it is necessary that a copy of the element is provided
instead of allowing some sort of in-place modification of the element.
This is discussed in some detail with Fernando somewhere up this thread.

> Anyhow, I'm glad to see you working on this, and hope you will 
> continue 
> with it.
> 

Thanks for your interest. If no objection is raised I plan to submit
the library for pre-formal review in a couple of weeks.

Regards,

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo

___
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost


[boost] Re: Interest in multiindex_set?(again)

2003-07-07 Thread Beman Dawes
At 03:15 PM 7/7/2003, Joaquín Mª López Muñoz wrote:

>Please find some preliminary documentation for multiindex_set
>at:
>
>http://groups.yahoo.com/group/boost/files/multiindex.zip
Hi Joaquín,

I took a quick look at multiindex.html, and found it quite interesting. It 
appears to meet a common need. Here are some specific (but not carefully 
thought out) comments:

*  The "multiindex_set" name seems awkward to me. Maybe "indexed_set" or 
"set_index"?

*  It seems safer to me for the default to be indexes that are unique.

*  Separating the specification of uniqueness from the list of indices 
seems like a recipe for mistakes, particularly during maintenance. Is it 
possible to combine uniqueness specification with the index tuple? Perhaps 
your example could look like this:

  typedef boost::multiindex::multiindex_set<
employee,
boost::tuple< employee::comp_id,
  multi< employee::comp_name >,
  employee_name::comp_ssnumber >,
> employee_set;
or this:

  typedef boost::multiindex::multiindex_set<
employee,
boost::tuple< unique< employee::comp_id >,
  non_unique< employee::comp_name >,
  unique< employee_name::comp_ssnumber > >,
> employee_set;
* The need for an update member function also comes up in other non-STL 
containers like B-trees, so don't worry that it is an add-on.

* It isn't clear (or I missed it) if iterators are always logically 
const-iterators or not. In other words, could your example have been 
written like this?

  typedef index_type::type employee_set_by_name;
  employee_set_by_name& i=es.get<1>();
  employee_set_by_name::iterator it=i.find("Anna Jones");
  it->name="Anna Smith"; //she just got married to Calvin Smith
Anyhow, I'm glad to see you working on this, and hope you will continue 
with it.

--Beman

___
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost


Re: [boost] Re: Interest in multiindex_set? (again)

2003-07-03 Thread Joaquín Mª López Muñoz


Arkadiy Vertleyb ha escrito:

[...]

>
> The reason we are interested in the multiindex_set is that we want to
> provide a more efficient implementation for our table, that is currently
> implemented with std::vector, although Ed Brey already suggested an
> alternative with std::set(that we need to examine).
>
> Regards,
>
> Arkadiy
>

I suggest you download multiindex.zip from the files section and play
with it to see if it fits your needs. If you need assistance with the usage of
the library don't hesitate to contact me. Next week some sort of preliminary
documentation  should be prepared, anyway.

Regards,

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo

___
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost


[boost] Re: Interest in multiindex_set? (again)

2003-07-02 Thread Arkadiy Vertleyb
I am just re-posting this response, since my first attempt to Cc it didn't 
work:

The "Relational Template Library" is not a part of Boost (although we would 
eventially like it to be).  We introduced it about a year ago, and got 
little response at that time.  The library then included a relational table 
(capable of storing any objects in its fields in a typesafe manner), and 
basic relational operators.  We were critisized then for the lack of 
indexing.

We added indexing capability about half a year ago.  We had some positive 
response from Boost, but eventially the discussion, again, died away.  
Although a few people seemed to be interested, no one apparently has really 
used the library or tried it out.

We also wrote an article for CUJ, and got positive response from them, too.  
If I understand them correctly, they did accept the article for publication, 
but put it on hold for now...

Right at this moment we are in the middle of the third round of development. 
 We are going to introduce basic transactions.  Along with this also comes 
automatic index rebuild.  That means you can have an arbitrary complicated 
expression, and an index on top of it (we allow to index any expession, not 
only tables), then change any tables this expression is based upon, and have 
the index automatically (and incrementally) updated, so you can call begin() 
on it, and start using it right away (of course iterators, as always, get 
invalidated).

We also met some compiler limitations related to the template tree size, and 
found an interesting way to deal with it.

We are planning to come up with a new version (that would include the above 
capabilities) within the next month or two.

The reason we are interested in the multiindex_set is that we want to 
provide a more efficient implementation for our table, that is currently 
implemented with std::vector, although Ed Brey already suggested an 
alternative with std::set(that we need to examine).

Regards,

Arkadiy

-Original Message-
From: Darren Cook [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 02, 2003 3:42 AM
To: [EMAIL PROTECTED]
Subject: Re: [boost] Re: Interest in multiindex_set? (again)
In boost mailing list you wrote:
We may use this to provide a better table implementation for the Relational
Template Library, we are currently working on.
Is that "Relational Template Library" part of Boost? (or going to be). Where
can I learn more about what you have planned?
Darren

_
Add photos to your e-mail with MSN 8. Get 2 months FREE*.  
http://join.msn.com/?page=features/featuredemail

___
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost


[boost] Re: Interest in multiindex_set? (again)

2003-06-30 Thread Ed Brey
Arkadiy wrote:
> Right now we are using a sorted vector instead of a set, to implement
> our relational tables, because set doesn't allow us to search on a
> prefix of a key.  Like if a table is indexed on a, b, c, we are not
> able to use equal_range on a, b with the set.

I was able to successfully use a set for the indices into an in-memory database I 
wrote for my company a few years back.  By providing a custom Compare object, I was 
able to implement prefix-based seeking using equal_range.  I allowed an element to 
indicate that it is truncated (e.g. contains only a and b, but not c).  The Comparison 
function used the indication when checking whether one element is less than another.


___
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost


[boost] Re: Interest in multiindex_set? (again)

2003-06-30 Thread Arkadiy
We may use this to provide a better table implementation for the Relational
Template Library, we are currently working on.

Right now we are using a sorted vector instead of a set, to implement our
relational tables, because set doesn't allow us to search on a prefix of a
key.  Like if a table is indexed on a, b, c, we are not able to use
equal_range on a, b with the set.  Sorted vector allows us to do this, but
sorted vector has obvious problems with inserts/deletes.

Do you think your library can provide both search on a key prefix and fast
inserts/deletes?

Arkadiy




"JOAQUIN LOPEZ MU?Z" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
Hi all,

A few weeks ago I prompted for interest in a multiply
indexed set container, whth alas little response from the
community. In the meanwhile, I continued working in
multiindex_set and now the library is 90% completed,
save docs. It is available at

http://groups.yahoo.com/group/boost/files/multiindex.zip

and have been checked to compile with MSVC++ 6.5 and
gcc 3.2 under cygwin.

IMHO multiindex_set is an useful resource and can help
programmers when they are in need for data structures more
complex than simple std::sets and std::multisets. In these
cases multiindex_set is a more robust solution than the usual
resort to ad hoc constructs based on containers of iterators
and stuff like that.

I am aware the complete lack of docs is a big barrier to the
evaluation of the library, so the following serves as a basic
introduction to multiindex_set that hopefully can motivate the
reader into trying the library. An extensive example of use is
provided in the URL above showing the whole set of capabilities
offered by the container.

* Usage: As std::set takes a comparison predicate to index
the elements contained, multiindex_set is instantiated with a
boost::tuple of such predicates, which we call indices.
For instance, say we have class employee with sorting
predicates based on some id (upon which employe::operator <
is written), name and age. One can construct a set providing
indices to all three orderings like this:

  typedef
multiindex_set<
  employee,
  tuple<
std::less,
less_by,
less_by > >
employee_set;

less_by is a facility intended to help construct a sorting predicate
based on a single member of the class. In general, one can use
whatever predicate with the right signature.
Unless explictly stated the fist index (std::less in our
example) is taken to be *unique*, while all others allow for duplicates.
This means one can insert employees with the same age but not
the same id. multiindex_set allows for specification of zero, one or
more unique indices, much in the same way as a relational table does.
So, the following constructs a multiindex_set with unique indices
for id and social security number:

  typedef
multiindex_set<
  employee,
  tuple<
std::less,
less_by,
less_by,
less_by >,
  unique_indices<0,3> // id and ss number
>
employee_set;

Once constructed, multiindex_set holds as many internal orderings
as indices are specified. Each index has the same interface as a
std::set container. By itself, multiindex_set inherits the functionality
of the first (primary) index.

  employee_set es;
  // retrieves a "view" based on the first index (name)
  employee_set::index_type<1>::type& i1=es.get<1>();

  // ordered by id
  std::cout<<"by id"<(std::cout));

  // ordered by name
  std::cout<<"by name"<(std::cout));

* Special lookup operations: When elements are expensive to
construct, it is a nuissance to have to build a whole object
just to search for a particular member of it. multiindex_set duplicates
the std::set lookup facilities to help you perform searchs without
supplying entire objects:

i1.find("John"); // find John using the name index

For this to work the predicates must be extended to accept the
appropriate subobjects (less_by automatically does this).
One can even pass in "compatible" sorting predicates, where
compatible means "inducing a weaker compatible ordering", as
the following example shows:

i1.lower_bound('J',employee::comp_initial()
// first employee whose name begins with J, using the
// name index and the special employee::comp_initial predicate.

* Update: given an element pointed to by a multiindex_set iterator,
one can change its contents via the update member function:

employee_iterator it=es.find(0); // employee with id==0
employee e=*it;
e.id=100; // change her ID
es.update(it,es); // and update the record.

updating is done efficiently (constant time) if the ordering
of the updated element remains the same with respect to all
indices, and logarithmically for other cases. Interestingly enough,
iterators and references to an element remain valid after updating.
update provides strong exception-safety, in the sense of
Stroustrup's TC++PL sp. ed.

* Implementation: multiindex_set has been written from scatch
with Boost in mind,

[boost] Re: Interest in multiindex_set? (again)

2003-06-30 Thread Fernando Cacciola
This looks fine in general.
I've needed something like it so I'm intereseted on seeing this on boost.

Some issues:

(1)
Why 'index' instead of 'key'? Associative containers use the term key
instead of index,
since index is typically related with random access instead of look up
access.

(2)  I would explore a friendlier interface. For instance:

(2.1)  key uniqueness could be specified along with the key itself
and not with a separate template parameter.

(2.2) keys could be tagged so that 'get' could be parametrized by
key tag instead of number.

(3) This is a showstopper for me: The key mutating operations should
use a higher level interface which protects clustering and ordering
invariants.
With the current interface, if I understood correctly, I could forget to
call update() which would break such invariants.
A better setup would be to use something along the lines of
the 'modifiers' used by the CGAL Library.
Check out the section named "Protected Access to Internal Representations"
on this page:

"http://www.cgal.org/Manual/doc_html/frameset/fsSupport.html";

Best,

Fernando Cacciola







___
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost